New integrated mode for GPT-4 in ChatGPT+
They switched me to the new integrated mode. This is supposed to have all kinds of upgrades; in particular, it is now possible to read and write images in one session.
I still don't know if it is possible to competently edit images via this workflow (I'll try, and I'll read what other people say, but it might be difficult without a more image-oriented input system). Nevertheless, just like in dmm.dreamwidth.org/76698.html, I load the avatar icon I am using here and talk to the new GPT-4 about it, and then I ask it to produce images based on that; see the comments for the conversation and images.
So I am just copy-pasting it here.
ME: uploading my dreamwidth avatar and asking: "What is in this image?"
GPT-4:
The image appears to be a schematic representation of a physics concept. Specifically, it looks like a diagram of an optical system involving ray tracing through lenses or a lens system.
Here’s a breakdown of what the diagram might represent:
* There are two sets of parallel lines that could represent two lenses or a system with a lens and a mirror.
* The lines with arrows might depict light rays.
* The light rays are converging or diverging, indicating the focal points of the lenses or mirrors.
* The "f" symbols represent the focal length, the distance between the center of a lens or mirror and its focal point.
* The curved line with two arrows might represent a lens that is focusing the parallel light rays to a point, or it could indicate a curved mirror doing the same.
This kind of diagram is used in optics to illustrate how light behaves as it passes through or reflects off optical components. It’s a fundamental tool in designing and understanding optical systems like cameras, telescopes, microscopes, and eyeglasses.
ME: I like your interpretation! Actually, this is a neural machine, where single neurons are powerful enough to transform flexible tree-shaped tensors. Can you make an image creatively inspired by this?
(That's actually quite good.)
ME: Great, this is an awesome image! Now, can you try to remix it with the image I uploaded?
(Mmmmm... maybe... the question of whether one can have good control over the image creation process remains open.)
ME: Great! What were the prompts you have sent to DALL-E for these two images?
GPT-4:
For the first image, the prompt sent to DALL-E was:
"Visualize a futuristic neural machine, where single neurons are depicted as intricate, metallic structures with branching, tree-like cables or conduits representing flexible tree-shaped tensors. These neurons are interconnected with glowing pathways of data flow, highlighting the transformation and processing power within a complex network. The entire scene should have a luminous, high-tech feel, with a clear sense of advanced computation and artificial intelligence at work. The background should be dark to contrast the bright, energy-filled connections of the neural network."
For the second image, the prompt was:
"Combine the concept of a neural machine with the schematic design from the user's uploaded image. The image should depict a fusion of the intricate, metallic tree-like structures representing neurons with the optical lens-like elements from the schematic. The neurons should have branching, flexible tensors interwoven with the schematic's lens and ray design, creating a harmonious blend of organic and structured elements. The entire composition should be illuminated with glowing pathways of data flow, set against a dark background to emphasize the complexity and elegance of this neural optical machine."
These prompts were designed to create a visual representation that merges the abstract concept of a neural network with the schematic structure you provided.
In any case, this is a massive, drastic upgrade; expect some things to break, and some old workflow inventions to stop working well and need reinventing.
https://clara.nytud.hu/~kk120/2017/bukatin/bukatin-4.png
I started a new session, loaded my avatar image, and made the following straightforward request:
*****
Hi, I'd like to take this image and add a red ball to its upper right corner...
*****
This resulted in a stunningly beautiful image, which used the motifs from my request but was completely different from what I asked for (so I am not hopeful about the ability to control image generation in this fashion):
ME: Interesting, thanks! What was the prompt you have sent to DALL-E?
GPT-4:
The prompt I used for DALL-E was as follows:
"An abstract scientific diagram showing a process with arrows and schematic symbols. There are two parallel vertical lines on the left, three pairs of horizontal arrows pointing towards each other, and a large oval loop on the right with an arrow pointing clockwise. Add a small red ball in the upper right corner of the image. The background is light grey, and all lines and text are black, maintaining a simple and clear technical drawing aesthetic."
ME: Generate an image with the following description: An abstract scientific diagram showing a process with arrows and schematic symbols. There are two parallel vertical lines on the left, three pairs of horizontal arrows pointing towards each other, and a large oval loop on the right with an arrow pointing clockwise. Add a small red ball in the upper right corner of the image. The background is light grey, and all lines and text are black, maintaining a simple and clear technical drawing aesthetic.
GROK 2: I generated an image with the prompt: 'An abstract scientific diagram showing a process with arrows and schematic symbols. Two parallel vertical lines on the left, three pairs of horizontal arrows pointing towards each other, a large oval loop on the right with an arrow pointing clockwise, and a small red ball in the upper right corner. Light grey background, black lines and text.'.
NOTE: Unlike GPT-4, Grok 2 does not seem to have direct access to the prompt it generated and sent to FLUX.1; it is a bit confused about all this:
ME: Thanks! What prompt did you send to FLUX.1?
GROK 2: I didn't send any prompt to FLUX.1, as I'm not connected to any external image generation services. However, if I were to describe the image you requested in a prompt for an AI art generator, it would look something like this:
"Create an abstract scientific diagram on a light grey background. Include:
* Two parallel vertical lines on the left side.
* Three pairs of horizontal arrows, each pair pointing towards each other.
* A large oval loop on the right with a clockwise arrow inside it.
* A small red ball in the upper right corner.
* All lines and text should be in black, maintaining a simple, clear technical drawing style."
> Hi, I'd like to take this image and add a red ball to its upper right corner...
Almost there, with only minor problems:
> Hi, I'd like to take this image and add a small red ball to its upper right corner...