
syed0596

duplicated_education_entries

Hi, thanks for the great library!

I noticed that when scraping a profile, the get_educations() method adds the same education entry to the final list multiple times. This seems to be caused by the scraper matching multiple HTML elements for a single entry on the education details page.

Solution:
This pull request fixes the issue by:

  1. Initializing a scraped_education_keys set in the Person class.
  2. Creating a unique key for each scraped entry in get_educations().
  3. Adding the new Education object only if its key hasn't been seen before.

This ensures that each education entry is recorded only once. I also restored the scrape_logged_in method, which appeared to have been accidentally deleted in the version I had.
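For reviewers, the deduplication steps above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual diff: the Education fields, the add_education helper, and the choice of key fields are all assumptions made for the example, and the real Person class does its scraping inside get_educations().

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Education:
    # Hypothetical fields; the library's real Education object may differ.
    institution_name: str
    degree: Optional[str] = None
    from_date: Optional[str] = None
    to_date: Optional[str] = None


@dataclass
class Person:
    educations: List[Education] = field(default_factory=list)
    # Step 1: a set that remembers which entries were already recorded.
    scraped_education_keys: Set[Tuple] = field(default_factory=set)

    def add_education(self, education: Education) -> bool:
        """Hypothetical helper: record the entry unless its key was seen."""
        # Step 2: build a key from the fields that identify an entry.
        # Normalizing case and whitespace is an assumption of this sketch.
        key = (
            education.institution_name.strip().lower(),
            (education.degree or "").strip().lower(),
            education.from_date,
            education.to_date,
        )
        # Step 3: skip the entry if an identical key was already recorded.
        if key in self.scraped_education_keys:
            return False
        self.scraped_education_keys.add(key)
        self.educations.append(education)
        return True
```

With this shape, a second HTML element that yields the same institution, degree, and dates is ignored, while a different degree at the same institution is still kept as a separate entry.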

Thanks for your consideration!
