This repository was archived by the owner on Jan 11, 2023. It is now read-only.

Export filter urls by regex #859

Closed
wants to merge 1 commit into from

Conversation

nealalpert
Contributor

This PR adds an optional parameter to the export command to allow export to only generate static pages for a subset of URLs matching one or more regular expressions.

The optional --filter parameter accepts one or more space-delimited regular expressions. These are used to select which URLs found during the recursive page crawl in the handle function in export.ts get crawled; URLs that match none of them are skipped.

Example: Running npx sapper export --filter '^en/* ^es/*' would export the index page at the root and any link that starts with en/ or es/. Any URL found in the body of a page would be compared to /(?:^en\/*)|(?:^es\/*)/ in the handle function to determine whether it should be crawled.
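
As a rough illustration of the matching described above, here is a minimal sketch of how a space-delimited filter string could be combined into a single regular expression and tested against a crawled path. The names buildFilter and shouldCrawl are hypothetical and are not taken from the PR's diff; the actual code in export.ts may differ.

```typescript
// Hypothetical helper (not from the PR): combine space-delimited patterns
// into one alternation, wrapping each pattern in a non-capturing group.
function buildFilter(filter: string): RegExp | null {
  const patterns = filter.split(/\s+/).filter(Boolean);
  if (patterns.length === 0) return null; // no filter given: crawl everything
  return new RegExp(patterns.map((p) => `(?:${p})`).join("|"));
}

// Hypothetical helper (not from the PR): decide whether a discovered path
// should be crawled. A null filter means no restriction.
function shouldCrawl(path: string, filter: RegExp | null): boolean {
  return filter === null || filter.test(path);
}

const langFilter = buildFilter("^en/* ^es/*");

console.log(shouldCrawl("en/info", langFilter));   // true
console.log(shouldCrawl("es/things", langFilter)); // true
console.log(shouldCrawl("fr/info", langFilter));   // false
```

Note that in a pattern such as ^en/* the trailing /* means "zero or more slashes", so a path like english/info would also match; a stricter pattern such as ^en/ avoids that.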

Use case:
A sapper/svelte project has used the following directory structure to deal with i18n as per issue #576:

.
└── routes
    └── [lang]
        ├── things
        │   ├── _validThings.js
        │   ├── index.json.js
        │   └── index.svelte
        ├── _error.svelte
        ├── layout.json.js
        └── index.svelte

If the content for a single language has been updated, the --filter parameter allows just that localization of the site to be exported.

The --filter parameter can also be used to segment portions of a site in order to parallelize the export process.

@khrome83

@Conduitry - can we get this looked at?

@swyxio

swyxio commented Oct 3, 2019

@nealalpert - i was able to merge in your other 2 PRs to my fork, but this one has a bunch of conflicts and is the oldest. will you help update your fork on top of the other changes you've made? i've discussed this with Khrome. https://github.com/sw-yx/sapper

@nealalpert
Contributor Author

nealalpert commented Oct 7, 2019

Sure thing. Sorry about the delay, I was on a bit of a vacation.

As far as I can tell there aren't any merge conflicts between this branch and the master branch. Would you like me to create a new branch with all three PRs merged together, or make a PR for your fork?

@swyxio

swyxio commented Oct 7, 2019

i was referring to conflicts between PRs. :)

i think you should do whatever makes sense for your workflow. for the time being i'm committed to keeping http://npm.im/@ssgjs/sapper a superset of sapper, but may break from that in the future. if you'd like to just publish your own fork with your changes instead, that may well make sense.

@aubergene
Contributor

aubergene commented Nov 5, 2019

What if you had a link attribute such as no-crawl which prevented crawling? Then perhaps you could pass a value to sapper at export to activate it.

<a href="en/info" no-crawl={langs.includes('en')}>Find out more</a>
<a href="es/info" no-crawl={langs.includes('es')}>Saber más</a>

The reason I like this is that I have links to URLs which are essentially within the subtree of the Svelte app but are actually handled upstream of it.

The way I currently handle this is to hard-code the absolute URL prefix, which is perhaps OK, and I should be happy with that.

@khrome83

<a href="en/info" no-crawl={langs.includes('en')}>Find out more</a>
<a href="es/info" no-crawl={langs.includes('es')}>Saber más</a>

@aubergene - Not sure this is a good idea. For one, you're now introducing code within the site that affects the export but would also make it into the final product, unless you add code to strip it out.

For two, you have to place that attribute everywhere you link to those sections. That means you have to change code whenever your export changes.

As our site grows, for example, we can modify the export process using the RegEx without touching the site's code itself. I have 1,000s of pages that are not represented on the file system due to dynamic route params. If the authors add a new content source, it benefits me to "shard" the export further. We use 4 Fargates right now to complete, in a few minutes, an export that takes over an hour on my local machine.

I hope that makes sense. So for point three, the export has to be different in different places, and it's not just about language, which means the ability to control it externally is cleaner.

@aubergene
Contributor

@khrome83 Yes, I agree. Re-reading my message above, it makes no sense now. I think I had some specific issue which I solved in a different way, and I was searching around for possible answers at the time.

@benmccann
Member

Thanks for putting together this PR! Sapper never had a good method of handling configuration, which made it difficult to know how to integrate this feature. However, that's been addressed in SvelteKit by adding a svelte.config.cjs file and adapters. I'm going to go ahead and close this PR since we won't be adding the feature to Sapper. You might like to check out SvelteKit instead. We'll be reviewing PRs for missing features there.

@benmccann benmccann closed this Mar 24, 2021