There has been no shortage of consternation over the introduction of generative artificial intelligence (AI) tools over the last several months. Critics predict job losses among writers and artists as creators turn to AI to produce their content. Outrage accompanies accusations that these tools steal work produced by humans to incorporate into their outputs.
Neither is true. Understanding how generative AI tools work can offer some comfort to communicators, who can then turn their attention to using them to make their jobs easier, just as I’ve done recently.
What Is Generative AI?
Generative AI focuses on creating new content or data based on a set of input data. This can include generating text, images or audio based on a given set of parameters or a model trained on a large data set. Generative AI has the potential to create new and unique content, making it a powerful tool for a variety of applications.
The paragraph above was written by ChatGPT, a new text generator that OpenAI released to the public at the end of November 2022. The set of input data is more colloquially known as a prompt. The prompt I entered to produce this text was, "Explain generative artificial intelligence in 50 words." It took ChatGPT less than 10 seconds to craft its explanation based on the parameters I provided.
The ChatGPT output also refers to large data sets. Unlike search engines, generative AI tools do not scour the web for resources when compiling their outputs. Instead, their neural networks learn from the massive data sets they are fed. An image generator can be fed potentially billions of images along with relevant information. It learns what an elephant looks like, as well as the style employed by the late Russian-French artist Marc Chagall, and can produce an original image of an elephant in Chagall’s style.
This is what concerns a lot of artists. A friend laments that her art could be part of a data set and find its way into an AI output for which she will never be compensated or even recognized. That, however, is not how these tools work. They do not "sample" works in a database. They will never produce outputs that are collections of pre-existing assets. Instead, think about a human artist who spends her life visiting museums and galleries, flipping through art books and devouring images online. She absorbs all of these images, and they influence her own work.
The Associated Press (AP) has been using a generative AI tool called Wordsmith since 2014 for financial reporting. Wordsmith's data set includes hundreds of thousands of earnings reports. Now, AP can feed raw financial data into the tool and it cranks out a serviceable article because it has learned the fairly rigid structure of an earnings report.
AP also uses generative AI to produce articles covering minor league baseball games, solving the problem of not having enough sports journalists on its payroll to cover every minor league game. But while generative AI tools like Wordsmith are hardly new, the latest crop of apps has sparked new interest.
What Are These Tools?
There are two high-level categories of generative AI tools: image and text. The hot image generators include Midjourney, Stable Diffusion and DALL-E 2 (from OpenAI, the same organization that just released ChatGPT).
DALL-E 2 and Stable Diffusion work much the same as ChatGPT: Visit the website, enter a prompt into the text field, and click to generate images the AI thinks you are looking for. Midjourney is a little different: You enter your prompts in a Discord discussion group, which is also where you see the images it creates. The images are also saved to your account on the web.
All three are capable of creating unique images in a vast array of styles, from watercolors to photorealistic illustrations. For example, a former colleague of mine, Steve Coulson, has been using Midjourney to produce a series of graphic novels, “The Bestiary Chronicles” ("humanity's last attempt to save themselves from the monsters that roam the planet").
A Midjourney-created image from Steve Coulson’s graphic novels, “The Bestiary Chronicles”
Jason Allen, a maker of tabletop fantasy games, used Midjourney to win first place in the "digitally manipulated photography" category of the Colorado State Fair's fine arts competition in September 2022, beating 20 other entrants.
Photo by Jason Allen from Midjourney
ChatGPT is not the first text generator but, so far, it is the best of its kind. I have been using Jasper, a fee-based text generator, for about a year, mostly for experimentation. While ChatGPT offers a prompt entry field, Jasper features a wide variety of templates. If you're working on a blog post, for example, you can select an introduction paragraph, a post outline, topic ideas and other templates. There's also an FAQ generator and a listicle template that generates a numbered list based on a topic.
As with ChatGPT, you produce content from any of these templates based on a text prompt. You can also select how many outputs you want. But with ChatGPT, you can do all of this in the prompt. For instance, you can prompt it to "Write a blog post introduction paragraph about why companies need to take a stand on social issues." Then, after it has completed its task, just click "Try again" to generate a new attempt.
There are also generative AI tools that produce music, video, 3D renderings and even computer code from text prompts.
How Can Communicators Use These Tools?
Artists are already using AI art generators to speed up their own creation processes. As noted in the publication Science Focus, "Illustrators and visual artists will be able to use these AI tools to generate ideas, gather inspiration and experiment with prototypes that they later edit into a final product." Communicators can do the same, sending an artist some AI-generated art, noting, "Something like this is what I have in mind," along with the specific tweaks they want.
For those communicators whose limited budgets have precluded the use of paid artists, though, these tools are a godsend. I recently needed an image to accompany a post to the leadership blog on the intranet where I work. I scoured the stock photo service I subscribe to but couldn't find anything that matched my need, so I turned to DALL-E 2, prompting the app to give me a photorealistic image of a supervisor and an employee engaged in a one-on-one meeting on a construction site. One of the four images it produced was exactly what I wanted.
ChatGPT also came to my rescue. Among my recurring efforts is a monthly wellness-focused newsletter emailed to all of the company's employees. The “Nutrition Nugget” column had been penned by the wellness coordinator, who recently left the company. Faced with spending an hour researching spaghetti squash, I opted to prompt ChatGPT to "write an article about spaghetti squash." I fact-checked the output, which turned out to be 100% accurate, then did a little research to add a line or two about the vegetable's history (which had always been an element of these articles). I was done in less than five minutes.
I do not feel the least bit bad about using AI to write a short article about spaghetti squash on my behalf. I was able to use that time on more substantive work that AI could not have done for me. You can also use ChatGPT to shorten an article, summarize it in bullet points, write headlines and subheads, and even check to ensure you have addressed all possible issues related to your topic.
I can envision countless uses for these tools. Imagine that you need an image of your CEO seated atop a mountain of (enter noun here). You can install Stable Diffusion on your own computer and train it on your CEO's image, then prompt it to deliver the image you want. I tested this ability, training it on images of me, then submitted several prompts: me as Captain America, me playing an electric guitar and an oil portrait of me that has become my new Facebook avatar (below).
AI-generated images from Stable Diffusion
In addition to improvements to the existing stable of generative AI tools and new entries in the text and image fields, more content categories will be possible soon. There are already tools for creating music this way, including one called Boomy. Once tools like this improve to the level of today's art and text generators, I envision using them to create a background music bed for a video.
Speaking of video, several startups use generative AI to let you produce marketing videos, explainer videos and other types of video. Take a look at Synthesys as one example; others include the likes of Synthesia and InVideo. Incidentally, if you already use Descript, the all-in-one audio editing tool, the app now lets you edit video just like you would edit text in a word processor, all thanks to generative AI. For example, Descript uses AI to create a voice that sounds like whoever is in the video so you can make corrections by typing in new words, just like you would in Microsoft Word. Descript then has the speaker say the new words in their own voice.
While this may sound like a hotbed of ethical issues, IABC members needn't worry as long as they abide by our Code of Ethics when using these tools.
As usual, the best way to stretch your own imagination about how these tools can serve you is to try them. Fortunately, ChatGPT and Stable Diffusion are among several that are completely free and the prices for others are remarkably low, especially compared to traditional alternatives.
Shel Holtz, SCMP, ABC, IABC Fellow