Diffbot Natural Language Processing¶
Bridge402 provides access to Diffbot's Natural Language Processing API through x402-payment-protected endpoints. Extract entities, sentiment, facts, and relationships from any freeform raw text using crypto payments.
Overview¶
The Natural Language API is a pre-trained classifier, named entity recognition model, sentence tokenizer, and sentiment analyzer rolled into a single service. It allows you to understand any piece of freeform raw text programmatically.
What it does:
- Extract entities (e.g., people, organizations, products) and data about them (e.g., sentiment, relationships) from raw text
- Analyze sentiment at both document and entity levels
- Extract structured facts and relationships
- Build knowledge graphs from text
- Discover open-domain facts beyond predefined schemas
API Endpoint¶
Natural Language Processing¶
POST /diffbot/nl
Process raw text documents using Diffbot Natural Language API. Extract entities, sentiment, facts, and relationships from any freeform text.
Query Parameters:
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| fields | string | No | Comma-separated list of fields to include | entities,sentiment,facts,records,sentences |
| network | string | No | Payment network preference | base or sol/solana |
Default Fields: If not specified, the API returns: entities, sentiment, facts, records, sentences
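As a minimal sketch of how the two query parameters combine, the helper below builds the request URL. The function name `build_nl_url` is hypothetical and is not part of any Bridge402 SDK; only the endpoint and parameter names come from this page.

```python
from urllib.parse import urlencode

# Endpoint taken from this page; the helper name is an assumption.
BASE_URL = "https://bridge402.tech/diffbot/nl"


def build_nl_url(fields=None, network=None):
    """Return the endpoint URL with optional `fields` and `network` parameters."""
    params = {}
    if fields:
        # The API expects one comma-separated string, not repeated parameters.
        params["fields"] = ",".join(fields)
    if network:
        params["network"] = network
    if not params:
        return BASE_URL
    # safe="," keeps the list readable instead of encoding commas as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


print(build_nl_url(["entities", "sentiment"], "sol"))
```

Omitting `fields` leaves the choice to the server, which then returns the default set listed above.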
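Request Body: A JSON object with a documents array. Each item carries the raw text to process in its text field, as in the request examples below.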
Headers:
| Header | Type | Required | Description |
|---|---|---|---|
| X-PAYMENT | string | Yes* | Base64-encoded x402 payment data |
| Content-Type | string | Yes | application/json |
*Required for access. If omitted, returns payment invoice (402 response).
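The following sketch assembles the headers and JSON body for a request. It assumes the body shape used in the request examples on this page (a `documents` array of objects with a `text` field). The helper name `build_nl_request` is hypothetical. Leaving out the payment value produces a request that should trigger the 402 invoice response.

```python
import json


def build_nl_request(texts, payment_header=None):
    """Return (headers, body) for POST /diffbot/nl.

    `payment_header` is the already Base64-encoded x402 payment string;
    producing it is out of scope for this sketch.
    """
    if not texts:
        raise ValueError("at least one document is required")
    headers = {"Content-Type": "application/json"}
    if payment_header is not None:
        headers["X-PAYMENT"] = payment_header
    # One object per document, each holding the raw text to analyze.
    body = {"documents": [{"text": text} for text in texts]}
    return headers, json.dumps(body)


headers, body = build_nl_request(["Sample text for invoice request"])
print(headers)
print(body)
```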
Features & Terminology¶
Entity. Anything in the real world. Example: Apple Inc, Steve Jobs.
Entity Type. A class of an entity. Example: organization, person. The list of entity types we support can be found in the Diffbot documentation.
Fact. A fact defines a relationship between entities (Apple Inc; founder; Steve Jobs) or an entity and a literal (Apple Inc; number of employees; 137,000).
Property. A property defines the relationship type (founder, number of employees) of a fact. The list of properties we support can be found in the Diffbot schema documentation.
Open Fact. Unlike a regular fact, an open fact does not follow a pre-defined list of properties. An open fact's property is extracted directly from the text. This enables new properties to be discovered. NOTE: This feature is currently disabled as we work to improve its capabilities.
Sentiment of a document. This value represents the overall sentiment of the text. It ranges from -1.0 (very negative) to 1.0 (very positive). Sentiment around 0.0 is considered neutral.
Sentiment of an entity. This value represents the sentiment of the text towards an entity. Example: "I love Apple products, but the iMac Pro is too pricey." is positive towards Apple and negative towards the iMac Pro.
Salience. This value helps answer the question: "What is this text mainly about?". Salience of 1.0 means the entity is the main topic of the document, while salience of 0.0 means that the entity is unnecessary to understand the document.
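To make salience and sentiment concrete, here is a small sketch that ranks entities by salience and labels a sentiment score. The 0.5 cut-offs and both function names are assumptions chosen for illustration; the API itself only returns the raw numbers.

```python
def main_topics(entities, min_salience=0.5):
    """Return names of entities at or above `min_salience`, most salient first."""
    ranked = sorted(entities, key=lambda e: e.get("salience", 0.0), reverse=True)
    return [e["name"] for e in ranked if e.get("salience", 0.0) >= min_salience]


def sentiment_label(score, neutral_band=0.5):
    """Map a score in [-1.0, 1.0] to a coarse label (thresholds are arbitrary)."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


entities = [
    {"name": "Cupertino", "salience": 0.2, "sentiment": 0.0},
    {"name": "Apple Inc", "salience": 0.9, "sentiment": 0.7},
    {"name": "iMac Pro", "salience": 0.6, "sentiment": -0.8},
]
print(main_topics(entities))
for entity in entities:
    print(entity["name"], sentiment_label(entity["sentiment"]))
```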
Supported Languages¶
NLP feature support may vary with each language.
| Feature | Languages Supported |
|---|---|
| Sentiment | Over 100 languages. View the full list |
| Entity | English (en), French (fr), Spanish (es), Chinese (zh), German (de), Russian (ru), Japanese (ja), Dutch (nl), Polish (pl), Norwegian (no), Danish (da), Swedish (sv), Italian (it) |
| Salience | English (en), French (fr), Spanish (es), Chinese (zh), German (de), Russian (ru), Japanese (ja), Dutch (nl), Polish (pl), Norwegian (no), Danish (da), Swedish (sv), Italian (it) |
| All Others (Facts, Open Facts, etc.) | English (en) only |
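The table can be turned into a simple lookup, sketched below. The feature labels and the function name `supports` are hypothetical. Because the full sentiment language list is not reproduced on this page, the sketch returns `None` (unknown) for sentiment rather than guessing.

```python
# Language codes copied from the Entity and Salience rows of the table.
ENTITY_SALIENCE_LANGS = {
    "en", "fr", "es", "zh", "de", "ru", "ja", "nl", "pl", "no", "da", "sv", "it",
}


def supports(feature, lang):
    """Return True/False for known rows, or None when the table cannot answer."""
    lang = lang.lower()
    if feature == "sentiment":
        # Over 100 languages are supported; the full list is not included here.
        return None
    if feature in ("entities", "salience"):
        return lang in ENTITY_SALIENCE_LANGS
    # Facts, open facts, and everything else are English only.
    return lang == "en"


print(supports("entities", "fr"))
print(supports("facts", "fr"))
```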
Credit Usage & Limits¶
Credit Usage:
- Each document consumes 1 credit up to 10,000 characters
- Additional blocks of 10,000 characters consume 1 credit each

Limits:
- Maximum of 100,000 characters per document
- Maximum of 1,000,000 total characters per API request
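The credit rule can be sketched as a ceiling division. This assumes that a partial block counts as a full block, that an empty document still costs 1 credit, and that Python's `len` matches how the service counts characters. None of these is confirmed by this page.

```python
import math

CHARS_PER_CREDIT = 10_000  # block size stated above


def credits_for_document(text):
    """Estimate credits for one document: 1 per started block of 10,000 chars."""
    return max(1, math.ceil(len(text) / CHARS_PER_CREDIT))


def credits_for_request(texts):
    """Estimate total credits for all documents in one request."""
    return sum(credits_for_document(text) for text in texts)


print(credits_for_document("a" * 10_000))
print(credits_for_document("a" * 10_001))
```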
Request Examples¶
Get Payment Invoice (Without Payment)¶
curl -X POST "https://bridge402.tech/diffbot/nl?network=sol" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "text": "Sample text for invoice request"
      }
    ]
  }'
Response (402 Payment Required):
{
  "x402Version": 1,
  "error": "X-PAYMENT header is required",
  "accepts": [
    {
      "scheme": "exact",
      "network": "solana",
      "maxAmountRequired": "10000",
      "asset": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
      "payTo": "BjxbJg48jQmoBLJnRunB1CMY5SZwvcUmnXCaWNeSXBei",
      "resource": "https://bridge402.tech/diffbot/nl",
      "description": "Diffbot Natural Language processing [Solana/USDC]",
      "mimeType": "application/json",
      "maxTimeoutSeconds": 120,
      "extra": {
        "product": "Bridge402 Diffbot — Natural Language Processing (Solana)",
        "extractionType": "nl",
        "feePayer": "2wKupLR9q6wXYppw8Gr2NvWxKBUqm4PPJKkQfoxHDBg4"
      }
    }
  ],
  "extractionType": "nl"
}
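A client usually needs to read this invoice before paying. The sketch below picks one entry from `accepts` and converts `maxAmountRequired` from atomic units to USDC. It assumes 6 decimal places, which matches the pricing on this page (10,000 atomic units = $0.01 USDC). The function name `summarize_invoice` is hypothetical.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # assumption consistent with 10,000 atomic units = 0.01 USDC


def summarize_invoice(invoice, preferred_network=None):
    """Pick a payment option from a 402 response and express the price in USDC."""
    options = invoice.get("accepts", [])
    if not options:
        raise ValueError("invoice has no payment options")
    # Use the preferred network when offered, otherwise the first option.
    chosen = next(
        (o for o in options if o.get("network") == preferred_network),
        options[0],
    )
    amount = Decimal(chosen["maxAmountRequired"]) / (Decimal(10) ** USDC_DECIMALS)
    return {
        "network": chosen["network"],
        "pay_to": chosen["payTo"],
        "asset": chosen["asset"],
        "amount_usdc": amount,
        "timeout_seconds": chosen.get("maxTimeoutSeconds"),
    }


invoice = {
    "accepts": [
        {
            "network": "solana",
            "maxAmountRequired": "10000",
            "asset": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
            "payTo": "BjxbJg48jQmoBLJnRunB1CMY5SZwvcUmnXCaWNeSXBei",
            "maxTimeoutSeconds": 120,
        }
    ]
}
print(summarize_invoice(invoice, "solana")["amount_usdc"])
```

`Decimal` is used instead of `float` so the amount stays exact when it is compared or displayed.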
Process Documents with Payment¶
curl -X POST "https://bridge402.tech/diffbot/nl?fields=entities,sentiment,facts&network=sol" \
  -H "Content-Type: application/json" \
  -H "X-PAYMENT: <base64-encoded-x402-payment>" \
  -d '{
    "documents": [
      {
        "text": "Apple Inc was founded by Steve Jobs in 1976. The company revolutionized personal computing with the introduction of the Macintosh."
      }
    ]
  }'
Response (200 Success):
{
  "extractionType": "nl",
  "data": [
    {
      "errors": [],
      "entities": [
        {
          "name": "Apple Inc",
          "diffbotUri": "http://diffbot.com/entity/Apple_Inc",
          "confidence": 0.95,
          "salience": 0.9,
          "sentiment": 0.0,
          "allUris": ["http://diffbot.com/entity/Apple_Inc"],
          "allTypes": [
            {
              "name": "Organization",
              "diffbotUri": "http://diffbot.com/entity/Organization"
            }
          ],
          "mentions": [
            {
              "text": "Apple Inc",
              "beginOffset": 0,
              "endOffset": 9,
              "isPronoun": false,
              "confidence": 0.95
            }
          ]
        },
        {
          "name": "Steve Jobs",
          "diffbotUri": "http://diffbot.com/entity/Steve_Jobs",
          "confidence": 0.92,
          "salience": 0.7,
          "sentiment": 0.0,
          "allTypes": [
            {
              "name": "Person",
              "diffbotUri": "http://diffbot.com/entity/Person"
            }
          ]
        }
      ],
      "sentiment": 0.2,
      "facts": [
        {
          "humanReadable": "Apple Inc was founded by Steve Jobs",
          "entity": { "name": "Apple Inc" },
          "property": { "name": "founder" },
          "value": { "name": "Steve Jobs" },
          "confidence": 0.9,
          "evidence": [
            {
              "passage": "Apple Inc was founded by Steve Jobs in 1976."
            }
          ]
        }
      ],
      "records": [],
      "categories": {},
      "sentences": [
        {
          "beginOffset": 0,
          "endOffset": 44
        }
      ],
      "language": "en",
      "summary": "Apple Inc was founded by Steve Jobs in 1976 and revolutionized personal computing."
    }
  ],
  "payment": {
    "verified": true,
    "settled": true,
    "txHash": "5xK...",
    "network": "solana"
  },
  "metadata": {
    "provider": "Diffbot",
    "endpoint": "nl",
    "timestamp": 1703123456.789
  }
}
Response Format¶
The Natural Language API returns an array of processed documents. Each document in the data array contains:
Entities¶
{
  "entities": [
    {
      "name": "Entity Name",
      "diffbotUri": "http://diffbot.com/entity/Entity_Name",
      "confidence": 0.95,
      "salience": 0.8,
      "sentiment": 0.6,
      "allUris": ["http://diffbot.com/entity/Entity_Name"],
      "allTypes": [
        {
          "name": "Organization",
          "diffbotUri": "http://diffbot.com/entity/Organization"
        }
      ],
      "mentions": [
        {
          "text": "Entity",
          "beginOffset": 0,
          "endOffset": 6,
          "isPronoun": false,
          "confidence": 0.95
        }
      ],
      "location": {
        "latitude": 37.7749,
        "longitude": -122.4194
      }
    }
  ]
}
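One common way to consume the entities array is to group entity names by type. The sketch below does that using the `allTypes` field shown above. The function name and the "Unknown" fallback label are assumptions for illustration.

```python
from collections import defaultdict


def entities_by_type(entities):
    """Group entity names under each type listed in their `allTypes` array."""
    grouped = defaultdict(list)
    for entity in entities:
        # Fall back to a placeholder when an entity carries no type information.
        types = entity.get("allTypes") or [{"name": "Unknown"}]
        for entity_type in types:
            grouped[entity_type["name"]].append(entity["name"])
    return dict(grouped)


entities = [
    {"name": "Apple Inc", "allTypes": [{"name": "Organization"}]},
    {"name": "Steve Jobs", "allTypes": [{"name": "Person"}]},
    {"name": "Tim Cook", "allTypes": [{"name": "Person"}]},
]
print(entities_by_type(entities))
```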
Sentiment¶
The sentiment field holds a single document-level score from -1.0 (very negative) to 1.0 (very positive), for example "sentiment": 0.2.
Facts¶
{
  "facts": [
    {
      "humanReadable": "Entity relationship description",
      "entity": { "name": "Entity Name" },
      "property": { "name": "propertyName" },
      "value": { "name": "Value Name" },
      "confidence": 0.9,
      "evidence": [
        {
          "passage": "Text passage where fact was found"
        }
      ],
      "entityMentions": [...],
      "valueMentions": [...]
    }
  ]
}
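Because each fact is an (entity, property, value) statement, the facts array maps naturally onto a small knowledge graph. The sketch below converts facts into triples and then into an adjacency map. The function names and the confidence filter are illustrative assumptions, not part of the API.

```python
def facts_to_triples(facts, min_confidence=0.0):
    """Convert fact objects into (entity, property, value) tuples."""
    triples = []
    for fact in facts:
        if fact.get("confidence", 0.0) < min_confidence:
            continue  # drop facts below the chosen confidence
        triples.append(
            (fact["entity"]["name"], fact["property"]["name"], fact["value"]["name"])
        )
    return triples


def build_graph(triples):
    """Build an adjacency map: subject -> list of (property, object) edges."""
    graph = {}
    for subject, prop, obj in triples:
        graph.setdefault(subject, []).append((prop, obj))
    return graph


facts = [
    {
        "entity": {"name": "Apple Inc"},
        "property": {"name": "founder"},
        "value": {"name": "Steve Jobs"},
        "confidence": 0.9,
    }
]
print(build_graph(facts_to_triples(facts, min_confidence=0.5)))
```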
Records¶
{
  "records": [
    {
      // Entities with attributes extracted according to KG schema
      // See https://docs.diffbot.com/docs/en/kg-ont-diffbotentity
    }
  ]
}
Sentences¶
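In the sample response on this page, each item in sentences carries a beginOffset and an endOffset into the submitted text. The sketch below recovers the sentence strings from those offsets. It assumes the offsets are character indices with an exclusive end, matching the mention example under Entities ("Entity" spans 0 to 6), and that they line up with Python string indexing. The function name is hypothetical.

```python
def split_sentences(text, sentences):
    """Slice `text` into sentence strings using beginOffset/endOffset pairs."""
    return [text[s["beginOffset"]:s["endOffset"]] for s in sentences]


text = "I love it. It works."
sentences = [
    {"beginOffset": 0, "endOffset": 10},
    {"beginOffset": 11, "endOffset": 20},
]
print(split_sentences(text, sentences))
```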
Integration Examples¶
Python Example¶
import asyncio
import httpx

async def process_natural_language(documents, payment_data, fields=None):
    """Process documents using Diffbot Natural Language API with x402 payment"""
    async with httpx.AsyncClient() as client:
        params = {"network": "sol"}
        if fields:
            params["fields"] = ",".join(fields)
        response = await client.post(
            "https://bridge402.tech/diffbot/nl",
            params=params,
            headers={
                "X-PAYMENT": payment_data,
                "Content-Type": "application/json"
            },
            json={"documents": documents}
        )
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 402:
            # Payment required - get invoice
            invoice = response.json()
            print(f"Payment required: {invoice['accepts'][0]['maxAmountRequired']} atomic units")
            return invoice
        else:
            raise Exception(f"Request failed: {response.status_code} - {response.text}")

# Usage: await is only valid inside a coroutine, so wrap the call in main()
async def main():
    documents = [
        {"text": "Bitcoin reached a new all-time high today. Ethereum network upgrade scheduled for next month."}
    ]
    result = await process_natural_language(
        documents,
        "<your-x402-payment>",
        fields=["entities", "sentiment", "facts"]
    )
    # A 402 invoice has no "data" key, so nothing is printed in that case
    if result.get("data"):
        for doc_result in result["data"]:
            print(f"Sentiment: {doc_result.get('sentiment')}")
            print(f"Entities: {len(doc_result.get('entities', []))}")
            print(f"Facts: {len(doc_result.get('facts', []))}")

asyncio.run(main())
JavaScript/Node.js Example¶
import { request } from 'undici';

async function processNaturalLanguage(documents, paymentData, fields = null) {
  let url = 'https://bridge402.tech/diffbot/nl?network=sol';
  if (fields && Array.isArray(fields)) {
    url += `&fields=${fields.join(',')}`;
  }
  const res = await request(url, {
    method: 'POST',
    headers: {
      'X-PAYMENT': paymentData,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ documents })
  });
  // Fail early on 402 (invoice) and other errors so callers always get an array
  if (res.statusCode !== 200) {
    const detail = await res.body.text();
    throw new Error(`Request failed: ${res.statusCode} - ${detail}`);
  }
  const data = await res.body.json();
  if (data.data && Array.isArray(data.data)) {
    return data.data.map(doc => ({
      sentiment: doc.sentiment,
      entities: doc.entities || [],
      facts: doc.facts || [],
      language: doc.language
    }));
  }
  return data;
}

// Usage
const documents = [
  {
    text: 'Bitcoin reached a new all-time high today. Ethereum network upgrade scheduled for next month.'
  }
];
const results = await processNaturalLanguage(
  documents,
  '<your-x402-payment>',
  ['entities', 'sentiment', 'facts']
);
results.forEach((result, index) => {
  console.log(`Document ${index + 1}:`);
  console.log(`  Sentiment: ${result.sentiment}`);
  console.log(`  Entities: ${result.entities.length}`);
  console.log(`  Facts: ${result.facts.length}`);
});
Using the Bridge402 SDK¶
import { DiffbotClient } from '@bridge402/sdk';
import { Keypair } from '@solana/web3.js';

// Load your wallet
const wallet = Keypair.fromSecretKey(/* your keypair */);

// Create Diffbot client
const client = new DiffbotClient({
  wallet: wallet,
  baseUrl: 'https://bridge402.tech',
  network: 'sol'
});

// Process documents with Natural Language API
const documents = [
  {
    text: 'Bitcoin reached a new all-time high today. Ethereum network upgrade scheduled for next month.'
  }
];
const result = await client.extractNaturalLanguage(
  documents,
  ['entities', 'sentiment', 'facts'] // Optional fields
);

// Access results
result.data.forEach((doc, index) => {
  console.log(`Document ${index + 1}:`);
  console.log(`  Sentiment: ${doc.sentiment}`);
  console.log(`  Entities: ${doc.entities.length}`);
  console.log(`  Facts: ${doc.facts.length}`);
  // Access specific entities
  doc.entities.forEach(entity => {
    console.log(`  - ${entity.name} (${entity.allTypes[0]?.name})`);
  });
});
Use Cases¶
Sentiment Analysis¶
Analyze sentiment of user reviews, social media posts, or customer feedback:
# Note: await must run inside an async function
documents = [
    {"text": "I love this product! It's amazing and works perfectly."},
    {"text": "Terrible quality. Would not recommend to anyone."}
]
result = await process_natural_language(documents, payment_data, ["sentiment"])
for doc in result["data"]:
    sentiment = doc["sentiment"]
    if sentiment > 0.5:
        print("Positive review")
    elif sentiment < -0.5:
        print("Negative review")
    else:
        print("Neutral review")
Entity Extraction¶
Extract and identify entities from news articles or documents:
const documents = [
  {
    text: "Apple Inc announced a new iPhone model. CEO Tim Cook presented the device at the company's headquarters in Cupertino."
  }
];
const result = await processNaturalLanguage(documents, paymentData, ['entities']);
result[0].entities.forEach(entity => {
  console.log(`${entity.name} - ${entity.allTypes[0]?.name}`);
  console.log(`  Salience: ${entity.salience}`);
  console.log(`  Sentiment: ${entity.sentiment}`);
});
Fact Extraction¶
Extract structured facts and relationships:
# Note: await must run inside an async function
documents = [
    {
        "text": "Apple Inc was founded by Steve Jobs in 1976. The company is headquartered in Cupertino, California."
    }
]
result = await process_natural_language(documents, payment_data, ["facts"])
for fact in result["data"][0]["facts"]:
    print(fact["humanReadable"])
    print(f"  Entity: {fact['entity']['name']}")
    print(f"  Property: {fact['property']['name']}")
    print(f"  Value: {fact['value']['name']}")
Multi-Document Processing¶
Process multiple documents in a single request:
const documents = [
  { text: "First document about Bitcoin..." },
  { text: "Second document about Ethereum..." },
  { text: "Third document about Solana..." }
];
const result = await processNaturalLanguage(
  documents,
  paymentData,
  ['entities', 'sentiment', 'facts']
);
// Process each document result
result.forEach((docResult, index) => {
  console.log(`Document ${index + 1}:`);
  console.log(`  Sentiment: ${docResult.sentiment}`);
  console.log(`  Entities: ${docResult.entities.length}`);
});
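When a corpus is larger than one request allows, the documents have to be split into several requests. The sketch below greedily packs texts into batches that respect the per-document and per-request character limits stated on this page. The function name is hypothetical, and it assumes Python's `len` matches how the service counts characters.

```python
MAX_DOC_CHARS = 100_000        # per-document limit from this page
MAX_REQUEST_CHARS = 1_000_000  # per-request limit from this page


def batch_documents(texts, max_total=MAX_REQUEST_CHARS, max_doc=MAX_DOC_CHARS):
    """Greedily group texts into request-sized lists of {"text": ...} objects."""
    batches, current, size = [], [], 0
    for text in texts:
        if len(text) > max_doc:
            raise ValueError(f"document of {len(text)} characters exceeds {max_doc}")
        # Start a new batch when adding this text would pass the request limit.
        if current and size + len(text) > max_total:
            batches.append(current)
            current, size = [], 0
        current.append({"text": text})
        size += len(text)
    if current:
        batches.append(current)
    return batches


# Small limits make the behaviour easy to see.
print(batch_documents(["aaaa", "bbbb", "cc"], max_total=8, max_doc=5))
```

Each returned batch can be sent as the `documents` array of one request, so each batch is a separately paid call.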
Error Handling¶
Common Errors¶
400 Bad Request
402 Payment Required
- The X-PAYMENT header is missing; the response body contains the payment invoice
500 Internal Server Error
- Diffbot API may be unavailable
- Invalid document format
- Retry the request
502 Bad Gateway
- Upstream Diffbot API error
- Verify Diffbot API key is configured on the server
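One way to act on these status codes is a small dispatch table plus an exponential backoff schedule for retries, sketched below. The action labels and function names are hypothetical, chosen for this example rather than defined by the API. Whether a retried request needs a fresh payment is not stated on this page, so a real client should check that before retrying.

```python
# Hypothetical action labels; the API only returns the status codes.
ACTIONS = {
    200: "ok",
    400: "fix_request",  # the request itself is malformed
    402: "pay",          # read the invoice and attach X-PAYMENT
    500: "retry",        # the page suggests retrying
    502: "report",       # upstream or server configuration problem
}


def next_action(status_code):
    """Map an HTTP status code to a coarse client action."""
    return ACTIONS.get(status_code, "fail")


def backoff_delays(attempts, base_seconds=1.0):
    """Return exponential delays in seconds: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


print(next_action(402))
print(backoff_delays(3))
```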
Best Practices¶
- Batch Processing: Process multiple documents in a single request to reduce API calls
- Field Selection: Only request fields you need to reduce response size and processing time
- Character Limits: Stay within 100,000 characters per document and 1,000,000 total per request
- Error Handling: Always handle 402 responses to get payment requirements
- Network Selection: Choose network based on your wallet capabilities (Base or Solana)
- Caching: Cache results for identical text inputs to avoid redundant processing
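The caching advice can be sketched with an in-memory dictionary keyed by a hash of the text and the requested fields. `ResultCache` and `cache_key` are hypothetical names, and the `compute` callback stands in for whatever function actually calls the API.

```python
import hashlib
import json


def cache_key(text, fields):
    """Return a stable key for a (text, fields) pair, ignoring field order."""
    payload = json.dumps({"text": text, "fields": sorted(fields)}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResultCache:
    """In-memory cache that avoids paying twice for identical inputs."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, text, fields, compute):
        key = cache_key(text, fields)
        if key not in self._store:
            # Only call the (paid) API when this input has not been seen.
            self._store[key] = compute(text, fields)
        return self._store[key]


cache = ResultCache()
fake_api = lambda text, fields: {"sentiment": 0.0}  # stand-in for a real call
print(cache.get_or_compute("hello", ["sentiment"], fake_api))
```

A persistent store would follow the same pattern. The dictionary here only lasts for the lifetime of the process.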
Pricing¶
- Cost: $0.01 USDC per request (10,000 atomic units)
- Payment Networks: Base or Solana (USDC)
- No Subscription Required: Pay-per-use model perfect for AI agents and intermittent access
- Credit Usage: 1 credit per 10,000 characters (additional blocks consume 1 credit each)
Support¶
For questions about Natural Language Processing or integration help, refer to:
- Payment Integration Guide
- Diffbot Extraction: for URL-based content extraction
- Contact the Bridge402 development team