Commit 8f4a6b7

feat: Vision LLM and Device Camera API Example (#1352)
1 parent 2672124 commit 8f4a6b7

7 files changed, +394 -78 lines changed


.env.example

+1
@@ -76,6 +76,7 @@ STRIPE_PKEY=pk_test_6pRNASCoBOKtIshFeQd4XMUh
 
 TOGETHERAI_API_KEY=sample-api-key
 TOGETHERAI_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
+TOGETHERAI_VISION_MODEL=meta-llama/Llama-Vision-Free
 
 TRAKT_ID=trakt-client-id
 TRAKT_SECRET=trackt-client-secret
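
Review note: the commit adds a third Together AI variable to .env.example. Below is a minimal sketch of how the three settings could be read and checked in one place. `getTogetherAiConfig` is a hypothetical helper and is not part of this commit; only the environment variable names come from the diff.

```javascript
// Hypothetical helper (not in the commit): gather the Together AI settings
// named in .env.example and list the ones that are unset or empty.
const getTogetherAiConfig = (env = process.env) => {
  const config = {
    apiKey: env.TOGETHERAI_API_KEY,
    model: env.TOGETHERAI_MODEL,
    visionModel: env.TOGETHERAI_VISION_MODEL,
  };
  const missing = Object.entries(config)
    .filter(([, value]) => !value)
    .map(([name]) => name);
  return { ...config, missing };
};

// Example with an environment that predates this commit (no vision model set).
const sampleConfig = getTogetherAiConfig({
  TOGETHERAI_API_KEY: 'sample-api-key',
  TOGETHERAI_MODEL: 'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free',
});
console.log(sampleConfig.missing); // [ 'visionModel' ]
```

A check like this would catch an older .env file that lacks TOGETHERAI_VISION_MODEL before the first camera request is made.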

README.md

+4 -3
@@ -82,9 +82,10 @@ I also tried to make it as **generic** and **reusable** as possible to cover mos
 - Delete Account
 - Contact Form (powered by SMTP via Sendgrid, Mailgun, AWS SES, etc.)
 - File upload
+- Device camera
 - **API Examples**
 
-- **AI:** OpenAI Moderation, Together AI foundational model LLMs (aka Deepseek, Llama, Mistral, etc.)
+- **AI:** OpenAI Moderation, LLAMA instruct, LLAMA vision (via Together AI serverless foundational models - Deepseek, Llama, Mistral, etc.)
 - **Backoffice:** Lob (USPS Mail), Paypal, Quickbooks, Stripe, Twilio (text messaging)
 - **Data, Media & Entertainment:** Alpha Vantage (stocks and finance info) with ChartJS, Github, Foursquare, Last.fm, New York Times, Trakt.tv (movies/TV), Twitch, Tumblr (OAuth 1.0a example), Web Scraping
 - **Maps and Location:** Google Maps, HERE Maps
@@ -108,7 +109,7 @@ I also tried to make it as **generic** and **reusable** as possible to cover mos
 - Hosted: No need to install, see the MongoDB Atlas section
 
 - [Node.js 22.12+](http://nodejs.org)
-- Highly recommended: Use/Upgrade your NodeJS to the latest NodeJS 22 LTS version.
+- Highly recommended: Use/Upgrade your Node.js to the latest Node.js 22 LTS version.
 - Command Line Tools
 - <img src="https://upload.wikimedia.org/wikipedia/commons/1/1b/Apple_logo_grey.svg" height="17">&nbsp;**Mac OS X:** [Xcode](https://itunes.apple.com/us/app/xcode/id497799835?mt=12) (or **OS X 10.9+**: `xcode-select --install`)
 - <img src="https://upload.wikimedia.org/wikipedia/commons/8/87/Windows_logo_-_2021.svg" height="17">&nbsp;**Windows:** [Visual Studio Code](https://code.visualstudio.com) + [Windows Subsystem for Linux - Ubuntu](https://learn.microsoft.com/en-us/windows/wsl/install) OR [Visual Studio](https://www.visualstudio.com/products/visual-studio-community-vs)
@@ -1023,7 +1024,7 @@ You now have a choice - to include your JavaScript code in Pug templates or have
 
 But it's also understandable if you want to take the easier road. Most of the time you don't even care about performance during hackathons, you just want to _"get shit done"_ before the time runs out. Well, either way, use whichever approach makes more sense to you. At the end of the day, it's **what** you build that matters, not **how** you build it.
 
-If you want to stick all your JavaScript inside templates, then in `layout.pug` - your main template file, add this to `head` block.
+If you want to stick all your JavaScript inside templates, then in `layout.pug` - your main template file, add this to the `head` block.
 
 ```pug
 script(src='/socket.io/socket.io.js')

app.js

+6 -2
@@ -28,7 +28,7 @@ const secureTransfer = process.env.BASE_URL.startsWith('https');
  * Rate limiting configuration
  * This is a basic rate limiting configuration. You may want to adjust the settings
  * based on your application's needs and the expected traffic patterns.
- * Alos, consider adding a proxy such as cloudflare for production.
+ * Also, consider adding a proxy such as cloudflare for production.
  */
 // Global Rate Limiter Config
 const limiter = rateLimit({
@@ -124,8 +124,10 @@ app.use(passport.initialize());
 app.use(passport.session());
 app.use(flash());
 app.use((req, res, next) => {
-  if (req.path === '/api/upload') {
+  if (req.path === '/api/upload' || req.path === '/api/togetherai-camera') {
     // Multer multipart/form-data handling needs to occur before the Lusca CSRF check.
+    // WARN: Any path that is not protected by CSRF here should have lusca.csrf() chained
+    // in their route handler.
     next();
   } else {
     lusca.csrf()(req, res, next);
@@ -233,6 +235,8 @@ app.get('/api/openai-moderation', apiController.getOpenAIModeration);
 app.post('/api/openai-moderation', apiController.postOpenAIModeration);
 app.get('/api/togetherai-classifier', apiController.getTogetherAIClassifier);
 app.post('/api/togetherai-classifier', apiController.postTogetherAIClassifier);
+app.get('/api/togetherai-camera', lusca({ csrf: true }), apiController.getTogetherAICamera);
+app.post('/api/togetherai-camera', strictLimiter, apiController.imageUploadMiddleware, lusca({ csrf: true }), apiController.postTogetherAICamera);
 
 /**
  * OAuth authentication failure handler (common for all providers)
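
Review note: the app.js change skips the global CSRF check for two paths so that Multer can parse the multipart body first, and the new routes then chain lusca themselves. The sketch below isolates that gating decision so it can be exercised without Express. `createCsrfGate`, `deferredCsrfPaths`, and `fakeCsrf` are hypothetical names for illustration; only the two paths come from the diff, and `fakeCsrf` merely stands in for `lusca.csrf()`.

```javascript
// Paths whose multipart bodies must be parsed before the CSRF check (from the diff).
const deferredCsrfPaths = new Set(['/api/upload', '/api/togetherai-camera']);

// Hypothetical factory: wraps any CSRF middleware and defers it for the listed paths.
// Deferred paths are expected to run the CSRF check later, in their own route chain.
const createCsrfGate = (csrfMiddleware, deferredPaths = deferredCsrfPaths) => (req, res, next) => {
  if (deferredPaths.has(req.path)) {
    next();
  } else {
    csrfMiddleware(req, res, next);
  }
};

// Stand-in for lusca.csrf(): records which paths were checked, then continues.
const checkedPaths = [];
const fakeCsrf = (req, res, next) => {
  checkedPaths.push(req.path);
  next();
};

let nextCalls = 0;
const gate = createCsrfGate(fakeCsrf);
gate({ path: '/api/togetherai-camera' }, {}, () => { nextCalls += 1; });
gate({ path: '/account' }, {}, () => { nextCalls += 1; });

console.log(checkedPaths); // [ '/account' ]
console.log(nextCalls); // 2
```

Keeping the deferred paths in one set makes the WARN comment in the diff easier to audit: every entry in the set should correspond to a route that chains its own CSRF middleware after the upload middleware.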

controllers/api.js

+184 -69
@@ -1495,6 +1495,178 @@ exports.postOpenAIModeration = async (req, res) => {
   });
 };
 
+/**
+ * Helper functions and constants for Together AI API Example
+ * We are using LLMs to classify text or analyze a picture taken by the user's camera.
+ */
+
+// Shared Together AI API caller
+const callTogetherAiApi = async (apiRequestBody, apiKey) => {
+  const response = await fetch('https://api.together.xyz/v1/chat/completions', {
+    method: 'POST',
+    headers: {
+      'Content-Type': 'application/json',
+      Authorization: `Bearer ${apiKey}`,
+    },
+    body: JSON.stringify(apiRequestBody),
+  });
+  if (!response.ok) {
+    const errData = await response.json().catch(() => ({}));
+    console.error('Together AI API Error Response:', errData);
+    const errorMessage = errData.error && errData.error.message ? errData.error.message : `API Error: ${response.status}`;
+    throw new Error(errorMessage);
+  }
+  return response.json();
+};
+
+// Vision-specific functions
+const createVisionLLMRequestBody = (dataUrl, model) => ({
+  model,
+  messages: [
+    {
+      role: 'user',
+      content: [
+        {
+          type: 'text',
+          text: 'What is in this image?',
+        },
+        {
+          type: 'image_url',
+          image_url: {
+            url: dataUrl,
+          },
+        },
+      ],
+    },
+  ],
+});
+
+const extractVisionAnalysis = (data) => {
+  if (data.choices && Array.isArray(data.choices) && data.choices.length > 0 && data.choices[0].message && data.choices[0].message.content) {
+    return data.choices[0].message.content;
+  }
+  return 'No vision analysis available';
+};
+
+// Classifier-specific functions
+const createClassifierLLMRequestBody = (inputText, model, systemPrompt) => ({
+  model,
+  messages: [
+    { role: 'system', content: systemPrompt },
+    { role: 'user', content: inputText },
+  ],
+  temperature: 0,
+  max_tokens: 64,
+});
+
+const extractClassifierResponse = (content) => {
+  let department = null;
+  if (content) {
+    try {
+      // Try to extract JSON from the response
+      const jsonStringMatch = content.match(/{.*}/s);
+      if (jsonStringMatch) {
+        const parsed = JSON.parse(jsonStringMatch[0].replace(/'/g, '"'));
+        department = parsed.department;
+      }
+    } catch (err) {
+      console.log('Failed to parse JSON from TogetherAI API response:', err);
+      // fallback: try to extract department manually
+      const match = content.match(/"department"\s*:\s*"([^"]+)"/);
+      if (match) {
+        [, department] = match;
+      }
+    }
+  }
+  return department || 'Unknown';
+};
+
+// System prompt for the classifier
+// This is the system prompt that instructs the LLM on how to classify the customer message
+// into the appropriate department.
+const messageClassifierSystemPrompt = `You are a customer service classifier for an e-commerce platform. Your role is to identify the primary issue described by the customer and return the result in JSON format. Carefully analyze the customer's message and select one of the following departments as the classification result:
+
+Order Tracking and Status
+Returns and Refunds
+Payments and Billing Issues
+Account Management
+Product Inquiries
+Technical Support
+Shipping and Delivery Issues
+Promotions and Discounts
+Marketplace Seller Support
+Feedback and Complaints
+
+Provide the output in this JSON structure:
+
+{
+"department": "<selected_department>"
+}
+Replace <selected_department> with the name of the most relevant department from the list above. If the inquiry spans multiple categories, choose the department that is most likely to address the customer's issue promptly and effectively.`;
+
+// Image Uploade middleware for Camera uploads
+const createImageUploader = () => {
+  const memoryStorage = multer.memoryStorage();
+  return multer({
+    storage: memoryStorage,
+    limits: { fileSize: 10 * 1024 * 1024 }, // 10MB limit
+  }).single('image');
+};
+
+exports.imageUploadMiddleware = (req, res, next) => {
+  const uploadToMemory = createImageUploader();
+  uploadToMemory(req, res, (err) => {
+    if (err) {
+      console.error('Upload error:', err);
+      return res.status(500).json({ error: err.message });
+    }
+    next();
+  });
+};
+
+const createImageDataUrl = (file) => {
+  const base64Image = file.buffer.toString('base64');
+  return `data:${file.mimetype};base64,${base64Image}`;
+};
+
+/**
+ * GET /api/togetherai-camera
+ * Together AI Camera Analysis Example
+ */
+exports.getTogetherAICamera = (req, res) => {
+  res.render('api/togetherai-camera', {
+    title: 'Together.ai Camera Analysis',
+    togetherAiModel: process.env.TOGETHERAI_VISION_MODEL,
+  });
+};
+
+/**
+ * POST /api/togetherai-camera
+ * Analyze image using Together AI Vision
+ */
+exports.postTogetherAICamera = async (req, res) => {
+  if (!req.file) {
+    return res.status(400).json({ error: 'No image provided' });
+  }
+  try {
+    const togetherAiKey = process.env.TOGETHERAI_API_KEY;
+    const togetherAiModel = process.env.TOGETHERAI_VISION_MODEL;
+    if (!togetherAiKey) {
+      return res.status(500).json({ error: 'TogetherAI API key is not set' });
+    }
+    const dataUrl = createImageDataUrl(req.file);
+    const apiRequestBody = createVisionLLMRequestBody(dataUrl, togetherAiModel);
+    // console.log('Making Vision API request to Together AI...');
+    const data = await callTogetherAiApi(apiRequestBody, togetherAiKey);
+    const analysis = extractVisionAnalysis(data);
+    // console.log('Vision analysis completed:', analysis);
+    res.json({ analysis });
+  } catch (error) {
+    console.error('Error analyzing image:', error);
+    res.status(500).json({ error: `Error analyzing image: ${error.message}` });
+  }
+};
+
 /**
  * GET /api/togetherai-classifier
  * Together AI / LLM API Example.
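
Review note: the vision path in this hunk is three pure steps around the network call: turn the uploaded file into a data URL, wrap it in a chat request body, and pull the text out of the reply. The sketch below repeats those three helpers from the diff (object literals condensed) and runs them on stand-in data. `fakeFile`, `okReply`, and `emptyReply` are hypothetical inputs, the three bytes are only a JPEG signature rather than a real image, and no request is sent.

```javascript
// Helpers repeated from the controllers/api.js diff so this sketch runs on its own.
const createImageDataUrl = (file) => {
  const base64Image = file.buffer.toString('base64');
  return `data:${file.mimetype};base64,${base64Image}`;
};

const createVisionLLMRequestBody = (dataUrl, model) => ({
  model,
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image_url', image_url: { url: dataUrl } },
      ],
    },
  ],
});

const extractVisionAnalysis = (data) => {
  if (data.choices && Array.isArray(data.choices) && data.choices.length > 0 && data.choices[0].message && data.choices[0].message.content) {
    return data.choices[0].message.content;
  }
  return 'No vision analysis available';
};

// Hypothetical stand-in for req.file as produced by in-memory upload storage.
const fakeFile = { buffer: Buffer.from([0xff, 0xd8, 0xff]), mimetype: 'image/jpeg' };

const dataUrl = createImageDataUrl(fakeFile);
console.log(dataUrl); // data:image/jpeg;base64,/9j/

const requestBody = createVisionLLMRequestBody(dataUrl, 'meta-llama/Llama-Vision-Free');
console.log(requestBody.messages[0].content[1].image_url.url === dataUrl); // true

// Hypothetical replies shaped like a chat completion response.
const okReply = { choices: [{ message: { content: 'A red bicycle.' } }] };
const emptyReply = { choices: [] };
const analysisOk = extractVisionAnalysis(okReply);
const analysisEmpty = extractVisionAnalysis(emptyReply);
console.log(analysisOk); // A red bicycle.
console.log(analysisEmpty); // No vision analysis available
```

Because the image travels inline as base64, the request body is roughly a third larger than the uploaded file, which is worth keeping in mind next to the 10MB upload limit set in the diff.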
@@ -1503,6 +1675,7 @@ exports.getTogetherAIClassifier = (req, res) => {
   res.render('api/togetherai-classifier', {
     title: 'Together.ai/LLM Department Classifier',
     result: null,
+    togetherAiModel: process.env.TOGETHERAI_MODEL,
     error: null,
     input: '',
   });
@@ -1522,82 +1695,24 @@ exports.postTogetherAIClassifier = async (req, res) => {
   const inputText = (req.body.inputText || '').slice(0, 300);
   let result = null;
   let error = null;
-
   if (!togetherAiKey) {
     error = 'TogetherAI API key is not set in environment variables.';
   } else if (!togetherAiModel) {
     error = 'TogetherAI model is not set in environment variables.';
   } else if (!inputText.trim()) {
-    error = 'Please enter a message to classify.';
+    error = 'Please enter the customer message to classify.';
   } else {
     try {
-      const systemPrompt = `You are a customer service classifier for an e-commerce platform. Your role is to identify the primary issue described by the customer and return the result in JSON format. Carefully analyze the customer's message and select one of the following departments as the classification result:
-
-Order Tracking and Status
-Returns and Refunds
-Payments and Billing Issues
-Account Management
-Product Inquiries
-Technical Support
-Shipping and Delivery Issues
-Promotions and Discounts
-Marketplace Seller Support
-Feedback and Complaints
-
-Provide the output in this JSON structure:
-
-{
-"department": "<selected_department>"
-}
-Replace <selected_department> with the name of the most relevant department from the list above. If the inquiry spans multiple categories, choose the department that is most likely to address the customer's issue promptly and effectively.`;
-
-      const response = await fetch('https://api.together.xyz/v1/chat/completions', {
-        method: 'POST',
-        headers: {
-          'Content-Type': 'application/json',
-          Authorization: `Bearer ${togetherAiKey}`,
-        },
-        body: JSON.stringify({
-          model: togetherAiModel,
-          messages: [
-            { role: 'system', content: systemPrompt },
-            { role: 'user', content: inputText },
-          ],
-          temperature: 0,
-          max_tokens: 64,
-        }),
-      });
-
-      if (!response.ok) {
-        const errData = await response.json().catch(() => ({}));
-        error = errData.error && errData.error.message ? errData.error.message : `API Error: ${response.status}`;
-      } else {
-        const data = await response.json();
-        const content = data.choices && data.choices[0] && data.choices[0].message && data.choices[0].message.content;
-        let department = null;
-        if (content) {
-          try {
-            // Try to extract JSON from the response
-            const jsonStringMatch = content.match(/{.*}/s);
-            if (jsonStringMatch) {
-              const parsed = JSON.parse(jsonStringMatch[0].replace(/'/g, '"'));
-              department = parsed.department;
-            }
-          } catch (err) {
-            console.log('Failed to parse JSON from TogetherAI API response:', err);
-            // fallback: try to extract department manually
-            const match = content.match(/"department"\s*:\s*"([^"]+)"/);
-            if (match) {
-              [, department] = match;
-            }
-          }
-        }
-        result = {
-          department: department || 'Unknown',
-          raw: content,
-          systemPrompt, // Send the sysetemPrompt to the front-end for this demo, not actual production applications.
-        };
-      }
+      const systemPrompt = messageClassifierSystemPrompt; // Your existing system prompt here
+      const apiRequestBody = createClassifierLLMRequestBody(inputText, togetherAiModel, systemPrompt);
+      const data = await callTogetherAiApi(apiRequestBody, togetherAiKey);
+      const content = data.choices && data.choices[0] && data.choices[0].message && data.choices[0].message.content;
+      const department = extractClassifierResponse(content);
+      result = {
+        department,
+        raw: content,
+        systemPrompt,
+      };
     } catch (err) {
       console.log('TogetherAI Classifier API Error:', err);
       error = 'Failed to call TogetherAI API.';
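
Review note: the refactor moves the reply parsing into `extractClassifierResponse`, which tries a JSON parse first and falls back to a regular expression. The sketch below repeats that function from the diff (with the `console.log` in the catch branch left out to keep the output quiet) and runs it on a few replies. The sample replies are hypothetical and are not captured model output.

```javascript
// Repeated from the controllers/api.js diff; logging in the catch branch omitted.
const extractClassifierResponse = (content) => {
  let department = null;
  if (content) {
    try {
      // Try to extract JSON from the response
      const jsonStringMatch = content.match(/{.*}/s);
      if (jsonStringMatch) {
        const parsed = JSON.parse(jsonStringMatch[0].replace(/'/g, '"'));
        department = parsed.department;
      }
    } catch (err) {
      // fallback: try to extract department manually
      const match = content.match(/"department"\s*:\s*"([^"]+)"/);
      if (match) {
        [, department] = match;
      }
    }
  }
  return department || 'Unknown';
};

// Hypothetical model replies covering each branch.
const sampleReplies = {
  cleanJson: '{"department": "Returns and Refunds"}',
  wrappedInProse: 'Sure! Here is the result:\n{\n  "department": "Technical Support"\n}\nLet me know.',
  singleQuotes: "{'department': 'Account Management'}",
  trailingComma: '{"department": "Product Inquiries", }',
  noJson: 'I cannot classify this message.',
};

const classified = Object.fromEntries(
  Object.entries(sampleReplies).map(([name, reply]) => [name, extractClassifierResponse(reply)]),
);
console.log(classified.cleanJson); // Returns and Refunds
console.log(classified.wrappedInProse); // Technical Support
console.log(classified.singleQuotes); // Account Management
console.log(classified.trailingComma); // Product Inquiries
console.log(classified.noJson); // Unknown
console.log(extractClassifierResponse(null)); // Unknown
```

The trailing-comma case is the one that reaches the regex fallback: `JSON.parse` throws, and the pattern still finds the quoted department name in the original text.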

views/api/index.pug

+8 -2
@@ -141,5 +141,11 @@ block content
       a(href='/api/togetherai-classifier', style='color: #000')
         .card.mb-3(style='background-color: rgb(128, 181, 255)')
           .card-body
-            img(src='https://i.imgur.com/dOCkJxT.png', height=40, style='padding: 0px 10px 0px 0px')
-            | Together AI - one-shot LLM
+            img(src='https://i.imgur.com/dOCkJxT.png', height=40, width=100, style='padding-right: 5px; object-fit: contain')
+            | Llama Instruct
+    .col-md-4
+      a(href='/api/togetherai-camera', style='color: #000')
+        .card.mb-3(style='background-color: rgb(128, 181, 255)')
+          .card-body
+            img(src='https://i.imgur.com/dOCkJxT.png', height=40, width=100, style='padding-right: 5px; object-fit: contain')
+            | Llama Vision + Camera
