Apache Parquet v1.11.0 Release Notes
Release Date: 2019-12-06
Release Notes - Parquet - Version 1.11.0
Bug
- PARQUET-138 - Parquet should allow a merge between required and optional schemas
- PARQUET-952 - Avro union with single type fails with 'is not a group'
- PARQUET-1128 - [Java] Upgrade the Apache Arrow version to 0.8.0 for SchemaConverter
- PARQUET-1281 - Jackson dependency
- PARQUET-1285 - [Java] SchemaConverter should not convert from TimeUnit.SECOND AND TimeUnit.NANOSECOND of Arrow
- PARQUET-1293 - Build failure when using Java 8 lambda expressions
- PARQUET-1296 - Travis kills build after 10 minutes, because "no output was received"
- PARQUET-1297 - [Java] SchemaConverter should not convert from Timestamp(TimeUnit.SECOND) and Timestamp(TimeUnit.NANOSECOND) of Arrow
- PARQUET-1303 - Avro reflect @Stringable field write error if field not instanceof CharSequence
- PARQUET-1304 - Release 1.10 contains breaking changes for Hive
- PARQUET-1305 - Backward incompatible change introduced in 1.8
- PARQUET-1309 - Parquet Java uses incorrect stats and dictionary filter properties
- PARQUET-1311 - Update README.md
- PARQUET-1317 - ParquetMetadataConverter throw NPE
- PARQUET-1341 - Null count is suppressed when columns have no min or max and use unsigned sort order
- PARQUET-1344 - Type builders don't honor new logical types
- PARQUET-1368 - ParquetFileReader should close its input stream for the failure in constructor
- PARQUET-1371 - Time/Timestamp UTC normalization parameter doesn't work
- PARQUET-1407 - Data loss on duplicate values with AvroParquetWriter/Reader
- PARQUET-1417 - BINARY_AS_SIGNED_INTEGER_COMPARATOR fails with IOBE for the same arrays with the different length
- PARQUET-1421 - InternalParquetRecordWriter logs debug messages at the INFO level
- PARQUET-1440 - Parquet-tools: Decimal values stored in an int32 or int64 in the parquet file aren't displayed with their proper scale
- PARQUET-1441 - SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter
- PARQUET-1456 - Use page index, ParquetFileReader throw ArrayIndexOutOfBoundsException
- PARQUET-1460 - Fix javadoc errors and include javadoc checking in Travis checks
- PARQUET-1461 - Third party code does not compile after parquet-mr minor version update
- PARQUET-1470 - Inputstream leakage in ParquetFileWriter.appendFile
- PARQUET-1472 - Dictionary filter fails on FIXED_LEN_BYTE_ARRAY
- PARQUET-1475 - DirectCodecFactory's ParquetCompressionCodecException drops a passed in cause in one constructor
- PARQUET-1478 - Can't read spec compliant, 3-level lists via parquet-proto
- PARQUET-1480 - INT96 to avro not yet implemented error should mention deprecation
- PARQUET-1485 - Snappy Decompressor/Compressor may cause direct memory leak
- PARQUET-1488 - UserDefinedPredicate throw NPE
- PARQUET-1496 - [Java] Update Scala for JDK 11 compatibility
- PARQUET-1497 - [Java] javax annotations dependency missing for Java 11
- PARQUET-1498 - [Java] Add instructions to install thrift via homebrew
- PARQUET-1510 - Dictionary filter skips null values when evaluating not-equals.
- PARQUET-1514 - ParquetFileWriter Records Compressed Bytes instead of Uncompressed Bytes
- PARQUET-1527 - [parquet-tools] cat command throw java.lang.ClassCastException
- PARQUET-1529 - Shade fastutil in all modules where used
- PARQUET-1531 - Page row count limit causes empty pages to be written from MessageColumnIO
- PARQUET-1533 - TestSnappy() throws OOM exception with Parquet-1485 change
- PARQUET-1534 - [parquet-cli] Argument error: Illegal character in opaque part at index 2 on Windows
- PARQUET-1544 - Possible over-shading of modules
- PARQUET-1550 - CleanUtil does not work in Java 11
- PARQUET-1555 - Bump snappy-java to 1.1.7.3
- PARQUET-1596 - PARQUET-1375 broke parquet-cli's to-avro command
- PARQUET-1600 - Fix shebang in parquet-benchmarks/run.sh
- PARQUET-1615 - getRecordWriter shouldn't hardcode CREAT mode when new ParquetFileWriter
- PARQUET-1637 - Builds are failing because default jdk changed to openjdk11 on Travis
- PARQUET-1644 - Clean up some benchmark code and docs.
- PARQUET-1691 - Build fails due to missing hadoop-lzo
New Feature
- PARQUET-1201 - Column indexes
- PARQUET-1253 - Support for new logical type representation
- PARQUET-1388 - Nanosecond precision time and timestamp - parquet-mr
Improvement
- PARQUET-1135 - upgrade thrift and protobuf dependencies
- PARQUET-1280 - [parquet-protobuf] Use maven protoc plugin
- PARQUET-1321 - LogicalTypeAnnotation.LogicalTypeAnnotationVisitor#visit methods should have a return value
- PARQUET-1335 - Logical type names in parquet-mr are not consistent with parquet-format
- PARQUET-1336 - PrimitiveComparator should implements Serializable
- PARQUET-1365 - Don't write page level statistics
- PARQUET-1375 - Upgrade to supported version of Jackson
- PARQUET-1383 - Parquet tools should indicate UTC parameter for time/timestamp types
- PARQUET-1390 - [Java] Upgrade to Arrow 0.10.0
- PARQUET-1399 - Move parquet-mr related code from parquet-format
- PARQUET-1410 - Refactor modules to use the new logical type API
- PARQUET-1414 - Limit page size based on maximum row count
- PARQUET-1418 - Run integration tests in Travis
- PARQUET-1435 - Benchmark filtering column-indexes
- PARQUET-1444 - Prefer ArrayList over LinkedList
- PARQUET-1445 - Remove Files.java
- PARQUET-1462 - Allow specifying new development version in prepare-release.sh
- PARQUET-1466 - Upgrade to the latest guava 27.0-jre
- PARQUET-1474 - Less verbose and lower level logging for missing column/offset indexes
- PARQUET-1476 - Don't emit a warning message for files without new logical type
- PARQUET-1487 - Do not write original type for timezone-agnostic timestamps
- PARQUET-1489 - Insufficient documentation for UserDefinedPredicate.keep(T)
- PARQUET-1490 - Add branch-specific Travis steps
- PARQUET-1492 - Remove protobuf install in travis build
- PARQUET-1499 - [parquet-mr] Add Java 11 to Travis
- PARQUET-1500 - Remove the Closables
- PARQUET-1502 - Convert FIXED_LEN_BYTE_ARRAY to arrow type in logicalTypeAnnotation if it is not null
- PARQUET-1503 - Remove Ints Utility Class
- PARQUET-1504 - Add an option to convert Parquet Int96 to Arrow Timestamp
- PARQUET-1505 - Use Java 7 NIO StandardCharsets
- PARQUET-1506 - Migrate from maven-thrift-plugin to thrift-maven-plugin
- PARQUET-1507 - Bump Apache Thrift to 0.12.0
- PARQUET-1509 - Update Docs for Hive Deprecation
- PARQUET-1513 - HiddenFileFilter Streamline
- PARQUET-1518 - Bump Jackson2 version of parquet-cli
- PARQUET-1530 - Remove Dependency on commons-codec
- PARQUET-1542 - Merge multiple I/O to one time I/O when read footer
- PARQUET-1557 - Replace deprecated Apache Avro methods
- PARQUET-1558 - Use try-with-resource in Apache Avro tests
- PARQUET-1576 - Upgrade to Avro 1.9.0
- PARQUET-1577 - Remove duplicate license
- PARQUET-1578 - Introduce Lambdas
- PARQUET-1579 - Add Github PR template
- PARQUET-1580 - Page-level CRC checksum verification for DataPageV1
- PARQUET-1601 - Add zstd support to parquet-cli to-avro
- PARQUET-1604 - Bump fastutil from 7.0.13 to 8.2.3
- PARQUET-1605 - Bump maven-javadoc-plugin from 2.9 to 3.1.0
- PARQUET-1606 - Fix invalid tests scope
- PARQUET-1607 - Remove duplicate maven-enforcer-plugin
- PARQUET-1616 - Enable Maven batch mode
- PARQUET-1650 - Implement unit test to validate column/offset indexes
- PARQUET-1654 - Remove unnecessary options when building thrift
- PARQUET-1661 - Upgrade to Avro 1.9.1
- PARQUET-1662 - Upgrade Jackson to version 2.9.10
- PARQUET-1665 - Upgrade zstd-jni to 1.4.0-1
- PARQUET-1669 - Disable compiling all libraries when building thrift
- PARQUET-1671 - Upgrade Yetus to 0.11.0
- PARQUET-1682 - Maintain forward compatibility for TIME/TIMESTAMP
- PARQUET-1683 - Remove unnecessary string converting in readFooter method
- PARQUET-1685 - Truncate the stored min and max for String statistics to reduce the footer size
Test
- PARQUET-1536 - [parquet-cli] Add simple tests for each command
Wish
- PARQUET-1552 - upgrade protoc-jar-maven-plugin to 3.8.0
- PARQUET-1673 - Upgrade parquet-mr format version to 2.7.0
Task
- PARQUET-968 - Add Hive/Presto support in ProtoParquet
- PARQUET-1294 - Update release scripts for the new Apache policy
- PARQUET-1434 - Release parquet-mr 1.11.0
- PARQUET-1436 - TimestampMicrosStringifier shows wrong microseconds for timestamps before 1970
- PARQUET-1452 - Deprecate old logical types API
- PARQUET-1551 - Support Java 11 - top-level JIRA
- PARQUET-1570 - Publish 1.11.0 to maven central
- PARQUET-1585 - Update old external links in the code base
- PARQUET-1645 - Bump Apache Avro to 1.9.1
- PARQUET-1649 - Bump Jackson Databind to 2.9.9.3
- PARQUET-1687 - Update release process