Multimodality: Upload PDFs and Images #135
feat: support drag and drop file upload
…rd instead of pdf parsing
…s with no text message
few comments, lmk when resolved
}) => {
  if (!blocks.length) return null;
  return (
    <div className={`flex flex-wrap gap-2 p-3.5 pb-0 ${className}`}>
need to wrap with cn
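To illustrate the request, here is a minimal sketch of why a class-merging helper is safer than a template literal when `className` may be undefined. The `cn` below is a simplified stand-in written for this example (an assumption; the project's real `cn` is not shown in this thread and likely does more, such as resolving conflicting Tailwind classes).

```typescript
// Simplified stand-in for the project's `cn` helper (assumption: the real one
// is not shown here). It only joins truthy class names with a space.
type ClassValue = string | false | null | undefined;

function cn(...inputs: ClassValue[]): string {
  return inputs.filter(Boolean).join(" ");
}

const base = "flex flex-wrap gap-2 p-3.5 pb-0";

// With a template literal, a missing className is rendered as the text "undefined".
const missing: string | undefined = undefined;
const viaTemplate = `${base} ${missing}`;

// With cn, falsy values are skipped.
const viaCn = cn(base, missing);
const viaCnWithExtra = cn(base, "mt-2");
```

In the component this would read roughly as `className={cn("flex flex-wrap gap-2 p-3.5 pb-0", className)}`.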
{contentString &&
  contentString !== "Other" &&
  contentString !== "Multimodal message" ? (
I think we can drop this in favor of just the following, now that getContentString won't return those strings:
- {contentString &&
-   contentString !== "Other" &&
-   contentString !== "Multimodal message" ? (
+ {contentString ? (
if (size === "lg") imgClass = "rounded-md object-cover h-24 w-24 text-xl";
return (
  <div
    className={`relative inline-block${className ? ` ${className}` : ""}`}
need to wrap with cn
make sure this is fixed everywhere
<div
  className={`relative inline-block${className ? ` ${className}` : ""}`}
>
  <img
Should we be using NextImage from next/image here instead?
src/hooks/use-file-upload.tsx
Outdated
export const SUPPORTED_IMAGE_TYPES = [
  "image/jpeg",
  "image/png",
  "image/gif",
  "image/webp",
];
export const SUPPORTED_FILE_TYPES = [
  ...SUPPORTED_IMAGE_TYPES,
  "application/pdf",
];
why two arrays and not one?
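One possible shape for a single-array version, offered only as a sketch: the helper names `isSupportedFileType` and `isImageType` are hypothetical and do not come from the PR, and whether the image/PDF distinction is still needed elsewhere is not visible in this thread.

```typescript
// Single list of accepted MIME types (same values as the two arrays above).
const SUPPORTED_FILE_TYPES = [
  "image/jpeg",
  "image/png",
  "image/gif",
  "image/webp",
  "application/pdf",
] as const;

type SupportedFileType = (typeof SUPPORTED_FILE_TYPES)[number];

// Hypothetical helper: type guard for accepted uploads.
function isSupportedFileType(mime: string): mime is SupportedFileType {
  return (SUPPORTED_FILE_TYPES as readonly string[]).includes(mime);
}

// Hypothetical helper: derive "is this an image?" from the MIME prefix
// instead of keeping a second array.
function isImageType(mime: string): boolean {
  return isSupportedFileType(mime) && mime.startsWith("image/");
}
```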
src/components/thread/index.tsx
Outdated
const setThreadId = (id: string | null) => {
  _setThreadId(id);

  // close artifact and reset artifact context
  closeArtifact();
  setArtifactContext({});
};
I don't think this should be dropped
It's called again later. That's my fault for the bad diff; going to do as suggested below.
src/components/thread/index.tsx
Outdated
};

const toolMessages = ensureToolCallsHaveResponses(stream.messages);

const context =
I don't think this should be dropped
src/components/thread/index.tsx
Outdated
};

// Restore handleRegenerate
Vibe coding comments shouldn't be committed
src/components/thread/index.tsx
Outdated
  closeArtifact();
  setArtifactContext({});
};
const threadId = _threadId;
Just name it threadId originally, where you define useQueryState, instead of re-assigning here
src/components/thread/index.tsx
Outdated
<motion.div
  className={cn(
    "grid w-full grid-cols-[1fr_0fr] transition-all duration-500",
    artifactOpen && "grid-cols-[3fr_2fr]",
    "relative flex min-w-0 flex-1 flex-col overflow-hidden",
    !chatStarted && "grid-rows-[1fr]",
I'm still confused why there are so many JSX changes here. It might be easier to just copy/paste whatever we have in main for the JSX, then add your code for the files, instead of trying to fix this merge conflict
PR Summary: Multimodal File Uploads (Images & PDFs) Integration
Overview
This PR introduces robust support for multimodal file uploads—specifically images and PDFs—across the chat UI and supporting utility functions. The changes enable users to attach images and PDF files to their chat messages, with proper encoding, preview, and message formatting for downstream processing (e.g., OpenAI-compatible formats).
Key Changes
1. src/lib/multimodal-utils.ts
fileToImageBlock(file: File): Converts an image file to a typed Base64ContentBlock for image uploads.
fileToPDFBlock(file: File): Converts a PDF file to a typed Base64ContentBlock for PDF uploads.
fileToBase64(file: File): Helper to encode any file as a base64 string (strips data URI prefix).
toOpenAIPDFBlock(block: Base64ContentBlock): Converts a PDF block to OpenAI-compatible file format.
toOpenAIImageBlock(block: Base64ContentBlock): Converts a base64 image block to OpenAI-compatible image format.
2. src/components/thread/index.tsx
Implementation Notes
lib/multimodal-utils.ts for reusability and testability.
Closes: #multimodal-upload, #image-upload, #pdf-upload
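As a rough illustration of the helpers described under src/lib/multimodal-utils.ts, the sketch below shows the prefix-stripping step and an image conversion. It is not the PR's code: `Base64BlockSketch`, `stripDataUriPrefix`, and `toOpenAIImageBlockSketch` are names invented for this example, the block shape is an assumption, and the output assumes the `image_url` content-part shape used by OpenAI-style chat APIs.

```typescript
// Assumed shape of a base64 content block; the actual Base64ContentBlock
// type used by the PR is not shown in this thread.
interface Base64BlockSketch {
  type: "image" | "file";
  source_type: "base64";
  mime_type: string;
  data: string; // raw base64, without a "data:...;base64," prefix
}

// Strip the "data:<mime>;base64," prefix that a data URL carries,
// leaving only the base64 payload.
function stripDataUriPrefix(dataUrl: string): string {
  const comma = dataUrl.indexOf(",");
  return comma === -1 ? dataUrl : dataUrl.slice(comma + 1);
}

// Convert an image block into an `image_url` content part by rebuilding
// the data URL from the MIME type and the base64 payload.
function toOpenAIImageBlockSketch(block: Base64BlockSketch) {
  return {
    type: "image_url" as const,
    image_url: { url: `data:${block.mime_type};base64,${block.data}` },
  };
}

const exampleBlock: Base64BlockSketch = {
  type: "image",
  source_type: "base64",
  mime_type: "image/png",
  data: stripDataUriPrefix("data:image/png;base64,iVBORw0KGgo="),
};
const openAIPart = toOpenAIImageBlockSketch(exampleBlock);
```

Keeping these as pure functions, separate from the file-reading step, is what makes them straightforward to unit test.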