Commit 6fdb9da

saritvakrat and wilwade authored
Update index.ts to support RLE_DICTIONARY (#112)
Problem
=======
Reading a Parquet file that was generated with Parquet V2 and uses RLE_DICTIONARY encoding failed with the error: invalid encoding: RLE_DICTIONARY #96

Reported issue: #96

Solution
========
Added: export * as RLE_DICTIONARY from './plain_dictionary';

I added this line to an existing project in node_modules and it works. Without this line I get an error; with this line added, the read passes.

Co-authored-by: Wil Wade <[email protected]>
1 parent 0a42955 commit 6fdb9da

5 files changed, +24 -5 lines changed

lib/codec/index.ts

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 export * as PLAIN from './plain'
 export * as RLE from './rle'
 export * as PLAIN_DICTIONARY from './plain_dictionary'
-
+export * as RLE_DICTIONARY from './plain_dictionary'
 
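The error text in the commit message suggests the reader looks codecs up by encoding name, so an encoding with no matching export fails before any decoding happens. The sketch below illustrates that idea under stated assumptions: `Codec`, `plainDictionary`, `codecs`, and `decodeWith` are hypothetical names for illustration and are not the library's actual internals. Registering the same module under a second name is plausible here because, in the Parquet format, dictionary-encoded data pages carry dictionary indices in the same layout under either encoding name.

```typescript
// Hypothetical stand-in for a codec module; not the library's real decoder.
interface Codec {
  decodeValues(raw: number[]): number[];
}

const plainDictionary: Codec = {
  // In this toy version, dictionary indices pass through unchanged.
  decodeValues: (raw: number[]): number[] => raw.slice(),
};

// Registry keyed by encoding name, loosely mirroring the exports of lib/codec/index.ts.
const codecs: Record<string, Codec> = {
  PLAIN_DICTIONARY: plainDictionary,
  // The alias this commit adds: the same module under a second name.
  RLE_DICTIONARY: plainDictionary,
};

// Look up a codec by name and fail loudly when the name is not registered.
function decodeWith(encoding: string, raw: number[]): number[] {
  const codec = codecs[encoding];
  if (codec === undefined) {
    throw new Error(`invalid encoding: ${encoding}`);
  }
  return codec.decodeValues(raw);
}
```

With the alias in place, `decodeWith("RLE_DICTIONARY", …)` resolves to the same decoder as `PLAIN_DICTIONARY`; removing the alias entry makes the same call throw the error reported in #96.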

test/test-files.js

Lines changed: 23 additions & 4 deletions
@@ -146,7 +146,7 @@ describe('test-files', function() {
     const scale = schema.fields["value"].scale;
     assert.equal(scale, 2);
     const divider = 10 ** scale;
-
+
     for (let i = 0; i < data.length; i++) {
       const valueToMatch = i + 1;
       // Decimal values whose primitive types are fixed length byte array will
@@ -160,11 +160,11 @@ describe('test-files', function() {
       assert.equal(numericalValue, valueToMatch);
     }
   });
-
+
   it('byte_array_decimal.parquet loads', async function () {
     const schema = await readSchema('byte_array_decimal.parquet');
     const data = await readData('byte_array_decimal.parquet');
-
+
     const scale = schema.fields["value"].scale;
     assert.equal(scale, 2);
     const divider = 10 ** scale;
@@ -173,7 +173,7 @@ describe('test-files', function() {
       const valueToMatch = i + 1;
       // Decimal values whose primitive types are byte array will
       // be returned as raw buffer values.
-      // For the test data, the actual decimal values and the corresponding buffer lengths
+      // For the test data, the actual decimal values and the corresponding buffer lengths
       // are small enough so we can treat the buffer as a positive integer and compare the values.
       // In reality, the user will need to use a more novel approach to parse the
       // buffer to an object that can handle large fractional numbers.
@@ -188,4 +188,23 @@ describe('test-files', function() {
       assert.equal(decimalValue, valueToMatch);
     }
   });
+
+  describe("RLE", function () {
+    // Tracked in https://github.com/LibertyDSNP/parquetjs/issues/113
+    it.skip('rle_boolean_encoding.parquet loads', async function() {
+      const data = await readData('rle/rle_boolean_encoding.parquet');
+      assert.deepEqual(data[0],{ datatype_boolean: true });
+      assert.deepEqual(data[1],{ datatype_boolean: false });
+    });
+
+    it('rle-dict-snappy-checksum.parquet loads', async function() {
+      const data = await readData('rle/rle-dict-snappy-checksum.parquet');
+      assert.deepEqual(data[0],{ binary_field: "c95e263a-f5d4-401f-8107-5ca7146a1f98", long_field: "0" });
+    });
+
+    it('rle-dict-uncompressed-corrupt-checksum.parquet loads', async function() {
+      const data = await readData('rle/rle-dict-uncompressed-corrupt-checksum.parquet');
+      assert.deepEqual(data[0],{ binary_field: "6325c32b-f417-41aa-9e02-9b8601542aff", long_field: "0" });
+    });
+  })
 });
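The test comments in this file note that byte-array decimals come back as raw buffers and that real data needs a parser able to handle large fractional numbers. One possible approach is sketched below, assuming the buffer holds the unscaled value as a big-endian two's-complement integer (the layout the Parquet format describes for DECIMAL). The helper names `bytesToBigInt` and `formatDecimal` are hypothetical and not part of the library; the sketch also assumes a runtime with `BigInt` support.

```typescript
// Interpret a big-endian two's-complement byte array as a signed bigint.
function bytesToBigInt(bytes: Uint8Array): bigint {
  let value = BigInt(0);
  for (let i = 0; i < bytes.length; i++) {
    value = (value << BigInt(8)) | BigInt(bytes[i]);
  }
  // A set top bit means the number is negative: subtract 2^(8 * length).
  if (bytes.length > 0 && (bytes[0] & 0x80) !== 0) {
    value -= BigInt(1) << BigInt(8 * bytes.length);
  }
  return value;
}

// Render an unscaled integer with `scale` fractional digits as a decimal string.
function formatDecimal(unscaled: bigint, scale: number): string {
  const negative = unscaled < BigInt(0);
  const magnitude = negative ? -unscaled : unscaled;
  // Left-pad so there is always at least one digit before the decimal point.
  const digits = magnitude.toString().padStart(scale + 1, "0");
  const whole = digits.slice(0, digits.length - scale);
  const fraction = digits.slice(digits.length - scale);
  const body = scale > 0 ? `${whole}.${fraction}` : whole;
  return negative ? `-${body}` : body;
}
```

For instance, with the schema's scale of 2, a one-byte buffer `0x64` (unscaled 100) would format as `1.00`. Returning a string sidesteps the precision loss that dividing by `10 ** scale` as a JavaScript number would introduce for large values.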
Binary file not shown.
Binary file not shown.
192 Bytes
Binary file not shown.