This repository was archived by the owner on Aug 25, 2024. It is now read-only.

source: file: Compression #15

Closed
@johnandersen777

Description

DFFML is hoping to participate in Google Summer of Code (GSoC) under the Python Software Foundation umbrella. You can read all about what this means at http://python-gsoc.org/. This issue, and any others tagged gsoc and project, are not generally available bugs; they relate to project ideas for GSoC.

Project Idea: File Source Compression

Project description:

DFFML's initial release includes a FileSource which saves and loads data from files using the load_fd and dump_fd methods.

JSON Example

async def load_fd(self, fd):
    repos = json.load(fd)
    self.mem = {src_url: Repo(src_url, data=data)
                for src_url, data in repos.items()}
    LOGGER.debug('%r loaded %d records', self, len(self.mem))

async def dump_fd(self, fd):
    json.dump({repo.src_url: repo.dict() for repo in self.mem.values()}, fd)
    LOGGER.debug('%r saved %d records', self, len(self.mem))

FileSource's _open method opens the file and passes it to load_fd:

async def _open(self):
    if not os.path.exists(self.filename) \
            or os.path.isdir(self.filename):
        LOGGER.debug('%r is not a file, initializing memory to empty dict',
                     self.filename)
        self.mem = {}
        return
    with open(self.filename, 'r') as fd:
        await self.load_fd(fd)

Allow for reading and writing compressed file formats transparently (that is, without subclasses having to do anything) in any source which is a subclass of FileSource.
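One way the "transparent" part might look is a single helper that picks an opener from the filename suffix, so load_fd and dump_fd never need to know whether the file is compressed. This is only a sketch: the helper name open_transparent, the OPENERS table, and the choice of gzip, bz2, and xz are assumptions for illustration, not something DFFML defines.

```python
import bz2
import gzip
import lzma
import os

# Hypothetical suffix-to-opener table; the formats listed are an assumption.
OPENERS = {
    ".gz": gzip.open,
    ".bz2": bz2.open,
    ".xz": lzma.open,
}


def open_transparent(filename, mode="r"):
    """Open filename, compressing or decompressing based on its suffix."""
    _, ext = os.path.splitext(filename)
    # Fall back to the builtin open for unrecognized suffixes
    opener = OPENERS.get(ext, open)
    # The compression modules default to binary mode, so ask for text mode
    # explicitly unless the caller already chose binary or text
    if "b" not in mode and "t" not in mode:
        mode += "t"
    return opener(filename, mode)
```

With something like this, the `open(self.filename, 'r')` call in _open could become `open_transparent(self.filename, 'r')`, and subclasses would keep calling json.load or json.dump on the file object as before.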

Skills: Python, git
Difficulty level: Easy

Related Readings/Links:

See https://docs.python.org/3/library/archiving.html for documentation on Python's data compression and archiving modules.

Potential mentors: @pdxjohnny

Getting Started: Figure out how to do one of the file types first, probably gzip, since that is likely as simple as using https://docs.python.org/3/library/gzip.html#gzip.GzipFile when the filename ends in .gz. Then move on to the rest. For now, just make modifications directly to the FileSource class. We may have you split out the logic later, but don't worry about another class for now.
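As a rough illustration of the gzip first step, here is a toy stand-in for FileSource that switches to gzip when the filename ends in .gz. The class GzipAwareFileSource, the _fd helper, and the _close method are hypothetical names invented for this sketch, and records are kept as plain dicts instead of Repo objects. It uses gzip.open, which wraps gzip.GzipFile in a text layer when given a text mode.

```python
import asyncio
import gzip
import json
import os


class GzipAwareFileSource:
    """Toy stand-in for FileSource; not DFFML's actual class."""

    def __init__(self, filename):
        self.filename = filename
        self.mem = {}

    async def load_fd(self, fd):
        self.mem = json.load(fd)

    async def dump_fd(self, fd):
        json.dump(self.mem, fd)

    def _fd(self, mode):
        # Use gzip only when the filename ends in .gz; "t" selects text mode
        # so json sees str rather than bytes
        if self.filename.endswith(".gz"):
            return gzip.open(self.filename, mode + "t")
        return open(self.filename, mode)

    async def _open(self):
        if not os.path.exists(self.filename) or os.path.isdir(self.filename):
            self.mem = {}
            return
        with self._fd("r") as fd:
            await self.load_fd(fd)

    async def _close(self):
        # Hypothetical counterpart to _open that writes records back out
        with self._fd("w") as fd:
            await self.dump_fd(fd)
```

Note that load_fd and dump_fd are untouched; only the place where the file is opened changes, which is what keeps the compression invisible to subclasses.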

What we want to see in your application: Describe how you intend to solve the problem, and give us some "stretch goals"; for example, maybe implement a remote file source which reads from URLs. Don't forget to include some time for building appropriate tests.
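For the remote file source stretch goal, the read side could start out as small as the sketch below. The function name load_records_from_url is hypothetical, the body is assumed to be JSON in the same shape the JSON example loads, and nothing here handles writing back, authentication, or errors.

```python
import json
import urllib.request


def load_records_from_url(url):
    """Fetch url and parse the response body as JSON (read-only sketch)."""
    with urllib.request.urlopen(url) as response:
        # json.load calls response.read(), which returns bytes; json accepts
        # UTF-8 encoded bytes directly
        return json.load(response)
```

Because urlopen also accepts file:// URLs, a test can point it at a temporary local file instead of needing network access.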

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), gsoc (Google Summer of Code related), project (Issues which will take a while to complete)
