source: file: Compression #15
Description
DFFML is hoping to participate in Google Summer of Code (GSoC) under the Python Software Foundation umbrella. You can read all about what this means at http://python-gsoc.org/. This issue, and any others tagged gsoc and project, are not generally available bugs, but related to project ideas for GSoC.
Project Idea: File Source Compression
Project description:
DFFML's initial release includes a FileSource which saves and loads data from files using the load_fd and dump_fd methods.
JSON Example
Lines 19 to 27 in dd8007d
For the open method of FileSource
Lines 36 to 44 in dd8007d
Allow for reading and writing the following file formats transparently (that is, without subclasses having to do anything) for any source which is a subclass of FileSource.
- gzip (by @yashlamba)
- bz2
- lzma
- zip
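One way to think about the formats listed above is as a lookup from filename suffix to an opener from the Python standard library. The following is only a minimal sketch: `OPENERS` and `open_transparent` are hypothetical names, not part of DFFML, and the choice of suffixes is an assumption. It relies on `gzip.open`, `bz2.open`, and `lzma.open` all accepting text modes such as `"rt"` and `"wt"`.

```python
import bz2
import gzip
import lzma
import pathlib

# Hypothetical mapping from filename suffix to a standard library opener.
# lzma.open writes the .xz container by default and auto-detects on read.
OPENERS = {
    ".gz": gzip.open,
    ".bz2": bz2.open,
    ".xz": lzma.open,
    ".lzma": lzma.open,
}


def open_transparent(filename, mode="rt"):
    """Open filename in text mode, picking a decompressor by its suffix.

    Falls back to the builtin open() when the suffix is not recognized.
    """
    suffix = pathlib.Path(filename).suffix.lower()
    opener = OPENERS.get(suffix, open)
    return opener(filename, mode)
```

zip is left out of this sketch on purpose. It is an archive format rather than a single compressed stream, so supporting it would also mean deciding which member of the archive holds the data.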
Skills: Python, git
Difficulty level: Easy
Related Readings/Links:
See https://docs.python.org/3/library/archiving.html for documentation
Potential mentors: @pdxjohnny
Getting Started: Figure out how to do one of the file types, probably gzip (that is likely as simple as using https://docs.python.org/3/library/gzip.html#gzip.GzipFile if the filename ends in .gz), then move on to the rest. For now just make modifications directly to the FileSource class. We may have you split out the logic later, but don't worry about another class for now.
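To make the gzip starting point concrete, here is a rough sketch of the idea, not DFFML's actual code. `SketchFileSource` and `SketchJSONSource` are hypothetical stand-ins, and the signatures of `open`, `close`, `load_fd`, and `dump_fd` are assumptions made for illustration. The point is that the base class decides how to open the file, so a subclass only ever sees a text file object.

```python
import gzip
import json


class SketchFileSource:
    """Hypothetical stand-in for FileSource (assumed interface)."""

    def __init__(self, filename):
        self.filename = filename
        self.data = {}

    def _open_fd(self, mode):
        # Use gzip only when the filename ends in .gz; "t" selects text mode
        if self.filename.endswith(".gz"):
            return gzip.open(self.filename, mode + "t")
        return open(self.filename, mode)

    def open(self):
        # Read the file and hand the file object to the subclass
        with self._open_fd("r") as fd:
            self.load_fd(fd)

    def close(self):
        # Write the data back out through the subclass
        with self._open_fd("w") as fd:
            self.dump_fd(fd)


class SketchJSONSource(SketchFileSource):
    """Subclass with no knowledge of compression."""

    def load_fd(self, fd):
        self.data = json.load(fd)

    def dump_fd(self, fd):
        json.dump(self.data, fd)
```

With this layout, `SketchJSONSource("data.json.gz")` and `SketchJSONSource("data.json")` behave the same from the subclass's point of view, which is the kind of transparency the project description asks for.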
What we want to see in your application: Describe how you intend to solve the problem, and give us some "stretch goals", such as implementing a remote file source which reads from URLs. Don't forget to include some time for building appropriate tests.