Refactored and fixed all documentation #10

Merged 2 commits on Mar 12, 2025
63 changes: 38 additions & 25 deletions sdks/javascript.mdx
yarn add scrapegraph-js

## Quick Start

### Basic example

<Note>
Store your API keys securely in environment variables. Use `.env` files and libraries like `dotenv` to load them into your app.
</Note>

```javascript
import { smartScraper } from 'scrapegraph-js';
import 'dotenv/config';

// Initialize variables
const apiKey = process.env.SGAI_APIKEY; // Set your API key as an environment variable
const websiteUrl = 'https://example.com';
const prompt = 'What does the company do?';

try {
  const response = await smartScraper(apiKey, websiteUrl, prompt); // Call the SmartScraper function
  console.log(response.result);
} catch (error) {
  console.error('Error:', error);
}
```


## Services

### SmartScraper
const response = await smartScraper(
);
```

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apiKey | string | Yes | The ScrapeGraph API Key. |
| websiteUrl | string | Yes | The URL of the webpage that needs to be scraped. |
| prompt | string | Yes | A textual description of what you want to achieve. |
| schema | object | No | The Pydantic or Zod object that describes the structure and format of the response. |
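
Before sending a request, the three required parameters above can be checked locally. The sketch below is illustrative only — the helper name and error messages are ours, not part of the SDK:

```javascript
// Hypothetical pre-flight check for the smartScraper parameters above.
// The function name and error messages are illustrative, not part of the SDK.
function validateScrapeArgs(apiKey, websiteUrl, prompt) {
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new Error('apiKey must be a non-empty string');
  }
  try {
    new URL(websiteUrl); // the URL constructor throws on malformed URLs
  } catch {
    throw new Error(`websiteUrl is not a valid URL: ${websiteUrl}`);
  }
  if (typeof prompt !== 'string' || prompt.trim().length === 0) {
    throw new Error('prompt must be a non-empty string');
  }
  return true;
}

console.log(validateScrapeArgs('your-api-key', 'https://example.com', 'What does the company do?'));
```

Failing fast on a malformed URL or empty prompt avoids spending API credits on a request that cannot succeed.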

<Accordion title="Basic Schema Example" icon="code">
Define a simple schema using Zod:

```javascript
import { z } from 'zod';

const ArticleSchema = z.object({
console.log(`Published: ${response.result.publishDate}`);
```
</Accordion>
<Accordion title="Advanced Schema Example" icon="code">
Define a complex schema for nested data structures:

```javascript
import { z } from 'zod';

const EmployeeSchema = z.object({
const response = await searchScraper(
);
```

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apiKey | string | Yes | The ScrapeGraph API Key. |
| prompt | string | Yes | A textual description of what you want to achieve. |
| schema | object | No | The Pydantic or Zod object that describes the structure and format of the response |
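
Since `schema` is optional, the positional argument list varies in length. A small hedged helper (the function name is ours, not an SDK export) makes the pattern explicit:

```javascript
// Illustrative helper (not part of the SDK): build the positional argument
// list for searchScraper, appending the optional schema only when provided.
function buildSearchArgs(apiKey, prompt, schema) {
  const args = [apiKey, prompt];
  if (schema != null) {
    args.push(schema);
  }
  return args;
}

console.log(buildSearchArgs('your-api-key', 'Find the best restaurants in San Francisco').length); // 2
```

It would be used as `searchScraper(...buildSearchArgs(apiKey, prompt, RestaurantSchema))`, spreading the array into the call.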

<Accordion title="Basic Schema Example" icon="code">
Define a simple schema using Zod:

```javascript
import { z } from 'zod';

const ArticleSchema = z.object({
console.log(`Published: ${response.result.publishDate}`);
```
</Accordion>
<Accordion title="Advanced Schema Example" icon="code">
Define a complex schema for nested data structures:

```javascript
import { z } from 'zod';

const EmployeeSchema = z.object({
const response = await searchScraper(
'Find the best restaurants in San Francisco',
RestaurantSchema
);
```
</Accordion>

const response = await markdownify(
'https://example.com'
);
```
#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apiKey | string | Yes | The ScrapeGraph API Key. |
| websiteUrl | string | Yes | The URL of the webpage to convert to markdown. |
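
Markdownify returns markdown text, and a common follow-up is writing it to a file. A minimal sketch using only Node built-ins (the helper name is ours, and the markdown string here is a stand-in for a real result):

```javascript
import { writeFileSync, readFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Sketch: persist a markdown string (such as a markdownify result) to disk.
function saveMarkdown(markdown, filename) {
  const path = join(tmpdir(), filename);
  writeFileSync(path, markdown, 'utf8');
  return path;
}

const saved = saveMarkdown('# Example Domain\n', 'example.md');
console.log(readFileSync(saved, 'utf8')); // prints "# Example Domain"
```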

## API Credits

173 changes: 34 additions & 139 deletions services/markdownify.mdx
response = client.markdownify(
```

```javascript JavaScript
import { markdownify } from 'scrapegraph-js';

const apiKey = 'your-api-key';
const url = 'https://scrapegraphai.com/';

try {
  const response = await markdownify(apiKey, url);
  console.log(response);
} catch (error) {
  console.error(error);
}
```

```bash cURL
curl -X 'POST' \

</CodeGroup>

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apiKey | string | Yes | The ScrapeGraph API Key. |
| websiteUrl | string | Yes | The URL of the webpage to convert to markdown. |
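
For reference, the cURL call above maps onto a plain request object. This is a hedged sketch: the endpoint URL, header name, and body field below are assumptions for illustration, not values taken from this page — verify them against the API reference before use:

```javascript
// Hedged sketch of the HTTP request the SDK sends. The endpoint placeholder,
// the 'SGAI-APIKEY' header name, and the body field are assumptions.
function buildMarkdownifyRequest(apiKey, websiteUrl, endpoint) {
  return {
    url: endpoint,
    options: {
      method: 'POST',
      headers: {
        'SGAI-APIKEY': apiKey, // assumed header name
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ website_url: websiteUrl }), // assumed field name
    },
  };
}

const req = buildMarkdownifyRequest('your-api-key', 'https://example.com', 'https://api.example/markdownify');
console.log(req.options.method); // POST
```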

<Note>
Get your API key from the [dashboard](https://dashboard.scrapegraphai.com)
</Note>
The response includes:
Want to learn more about our AI-powered scraping technology? Visit our [main website](https://scrapegraphai.com) to discover how we're revolutionizing web data extraction.
</Note>

## Other Functionality

### Retrieve a previous request

If you know the response ID of a previous request, you can retrieve all of its information.

<CodeGroup>

```python Python
# TODO
```

```javascript JavaScript
import { getMarkdownifyRequest } from 'scrapegraph-js';

const apiKey = 'your_api_key';
const requestId = 'ID_of_previous_request';

try {
  const requestInfo = await getMarkdownifyRequest(apiKey, requestId);
  console.log(requestInfo);
} catch (error) {
  console.error(error);
}
```

```bash cURL
# TODO
```

</CodeGroup>

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apiKey | string | Yes | The ScrapeGraph API Key. |
| requestId | string | Yes | The request ID associated with the output of a previous markdownify request. |
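
A previous request may still be in flight when you first fetch it, so retrieval is often wrapped in a retry loop. This sketch is ours — the `status` field and its values are assumptions, and the inter-attempt delay is omitted so the control flow stays easy to follow (a real version would `await` a timeout between attempts):

```javascript
// Generic polling sketch. The 'status' field and its 'processing'/'completed'/
// 'failed' values are illustrative assumptions, not taken from the SDK docs.
function pollUntilDone(fetchStatus, retries = 5) {
  for (let attempt = 0; attempt < retries; attempt++) {
    const info = fetchStatus();
    if (info.status === 'completed' || info.status === 'failed') {
      return info;
    }
  }
  throw new Error('request did not finish within the retry budget');
}

// Stubbed status fetcher: reports 'processing' twice, then 'completed'.
let calls = 0;
const stub = () => (++calls < 3 ? { status: 'processing' } : { status: 'completed' });
console.log(pollUntilDone(stub).status); // completed
```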

### Async Support
