This is my overhauled and improved ChatGPT API UI, forked from patrikzudel (PatrikZeros-ChatGPT-API-UI). Firstly, a HUGE thank you to Patrik for his fantastic work in the original version.
There were a number of improvements I wanted to make. This was initially done for only my personal use, but now I am offering it publicly.
- I refactored the codebase to be modularized, instead of being contained in one large file. This makes extending the code much easier.
- Because all data is browser-side, wiping cache will wipe the data. The user can now preserve their conversations by exporting their session file.
- The settings pane has a button to conduct an API check and let the user know if the API key is working or not.
- Instead of it being hardcoded to use a certain model, the settings pane now queries the API for a list of available models, and the user may select one. The UI will display the currently active model in the side pane.
- In the settings pane you can select a mode: GPT, GPT + Vision, or TTS. The list of available models dynamically filters.
- With a TTS model selected, the settings pane also allows a user to select a Voice model. Then the user can submit text to be converted into an MP3. The resulting message displays a play button and a download button. Note that these are ephemeral: they are only stored in the active session and will no longer work if you reload the page.
- With a Vision model selected, the user can upload one or more images to the API. The UI will dynamically update to expose the image upload button when this mode is selected. The AI will be able to see and tell you about the images.
- A copy button is now present on any message, allowing the user to copy it to their clipboard.
- When using a GPT model, the user can upload a PDF, which is parsed for text entirely browser-side. Then the user may ask the AI questions about the contents of the PDF.
- Just like on the official UI, the user may edit any message. This branches the conversation at that point.
- The conversation now allows the user to safely scroll up while text is streaming in. If the user scrolls back down to the bottom, the chat window snaps back into scrolling mode.
- The UI more closely resembles the official ChatGPT interface, with better icons, avatars, and expanding text areas.
- The API is supposed to send JSON packets for the app to render as incoming text, but sometimes it sends multiple JSON packets in a burst. When multiple JSON packets arrive at once, the original code would error out. Now, the app gracefully unpacks and separates the JSON packets so the text continues to flow.
- While the AI is talking to the user and data is streaming in, the UI won't allow the user to submit a new message. If the user wants to stop the API from talking, they can click the stop button and safely stop the incoming stream.
- In the background, the app will request the API to generate a title for the conversation after the first message is sent. This hidden prompt is now improved so the title is more meaningful and useful.
- The user can now click the edit button on a conversation title and give it a custom title.
The original version had special buttons for "send without history" and "summarize". I removed these as accounting for these edge cases made modularizing the code quite difficult. I also didn't find myself ever making use of them. To "send without history" you can simply start a new conversation. To "summarize" you can simply ask the AI to summarize the conversation. However, now that the code is robustly modularized, someone could add them back in if desired.
Just like the original version, the conversations are stored in the browser cache, and will be lost if the cache is wiped.
I don't necessarily plan to actively support requests, as I forked the original just to improve it for my own needs. Feel free to let me know about bugs, and I might get to it. Otherwise, feel free to fork this or the original and make further improvements.