Apache Parquet v1.10.0 Release Notes

Release Date: 2018-04-05
  • Release Notes - Parquet - Version 1.10.0

    Bug

    • PARQUET-196 - parquet-tools command to get rowcount & size
    • PARQUET-357 - Parquet-thrift generates wrong schema for Thrift binary fields
    • PARQUET-765 - Upgrade Avro to 1.8.1
    • PARQUET-783 - H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
    • PARQUET-786 - parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
    • PARQUET-791 - Predicate pushing down on missing columns should work on UserDefinedPredicate too
    • PARQUET-1005 - Fix DumpCommand parsing to allow column projection
    • PARQUET-1028 - [JAVA] When reading old Spark-generated files with INT96, stats are reported as valid when they aren't
    • PARQUET-1065 - Deprecate type-defined sort ordering for INT96 type
    • PARQUET-1077 - [MR] Switch to long key ids in KEYs file
    • PARQUET-1141 - IDs are dropped in metadata conversion
    • PARQUET-1152 - Parquet-thrift doesn't compile with Thrift 0.9.3
    • PARQUET-1153 - Parquet-thrift doesn't compile with Thrift 0.10.0
    • PARQUET-1156 - dev/merge_parquet_pr.py problems
    • PARQUET-1185 - TestBinary#testBinary unit test fails after PARQUET-1141
    • PARQUET-1191 - Type.hashCode() takes originalType into account but Type.equals() does not
    • PARQUET-1208 - Occasional endless loop in unit test
    • PARQUET-1217 - Incorrect handling of missing values in Statistics
    • PARQUET-1246 - Ignore float/double statistics in case of NaN
    • PARQUET-1258 - Update scm developer connection to github

    New Feature

    • PARQUET-1025 - Support new min-max statistics in parquet-mr

    Improvement

    • PARQUET-220 - Unnecessary warning in ParquetRecordReader.initialize
    • PARQUET-321 - Set the HDFS padding default to 8MB
    • PARQUET-386 - Printing out the statistics of metadata in parquet-tools
    • PARQUET-423 - Make writing Avro to Parquet less noisy
    • PARQUET-755 - create parquet-arrow module with schema converter
    • PARQUET-777 - Add new Parquet CLI tools
    • PARQUET-787 - Add a size limit for heap allocations when reading
    • PARQUET-801 - Allow UserDefinedPredicates in DictionaryFilter
    • PARQUET-852 - Slowly ramp up sizes of byte[] in ByteBasedBitPackingEncoder
    • PARQUET-884 - Add support for Decimal datatype to Parquet-Pig record reader
    • PARQUET-969 - Decimal datatype support for parquet-tools output
    • PARQUET-990 - More detailed error messages in footer parsing
    • PARQUET-1024 - allow for case insensitive parquet-xxx prefix in PR title
    • PARQUET-1026 - allow unsigned binary stats when min == max
    • PARQUET-1115 - Warn users when misusing parquet-tools merge
    • PARQUET-1135 - upgrade thrift and protobuf dependencies
    • PARQUET-1142 - Avoid leaking Hadoop API to downstream libraries
    • PARQUET-1149 - Upgrade Avro dependency to 1.8.2
    • PARQUET-1170 - Logical-type-based toString for proper representation in tools/logs
    • PARQUET-1183 - AvroParquetWriter needs OutputFile based Builder
    • PARQUET-1197 - Log rat failures
    • PARQUET-1198 - Bump java source and target to java8
    • PARQUET-1215 - Add accessor for footer after a file is closed
    • PARQUET-1263 - ParquetReader's builder should use Configuration from the InputFile

    Task