An Introduction to Large Language Models and Generative AI
The genie is out of the bottle
Introduction
I’m going to take a break from my testing blog series briefly to devote a few blogs to Artificial Intelligence (AI).
AI has become mainstream. One of its first major public demonstrations came in 2011, when IBM Watson beat the two best human contestants on the popular TV quiz show Jeopardy!. AI appears constantly in news reports, blogs, etc., often with dire warnings, such as the misdirection of deepfakes. AI was a major point of contention in the 2023 contract negotiations between the actors’ and writers’ unions and the production studios.
This blog series will be limited to a subset of AI: Large Language Models (LLMs) and Generative AI (GenAI).
NOTE: I may interchange LLM and GenAI to some degree in these blogs. They are related, but there is a distinction between them. An LLM is a technology that creates relationships between words based upon a large set of training data. GenAI is a process in which prompts access those LLM relationships to create custom results.
LLMs and GenAI may change the world. They may be a flash in the pan. Right now they’re peaking on the Gartner Hype Cycle. Regardless, the genie has been let out of the bottle. Don’t stick your head in the sand. LLMs and GenAI are going to continue to encroach upon our lives. It’s probably wise to have some understanding of them.
I’ve audited a few online LLM/GenAI courses, and I wanted to share what I’ve learned. These LLM and GenAI blogs will be geared more toward the layperson than my usual software engineering-oriented blog entries. I won’t delve into the algorithms that create LLMs, because quite frankly, I don’t know the technical side well enough to write about it yet.
ChatGPT Shakes The Etch-A-Sketch
LLMs have been in the software engineering zeitgeist for a few years. They hit the mainstream in November 2022, when OpenAI released a GenAI named ChatGPT upon the world. Suddenly anyone could experiment with GenAI easily.
I remember playing around with ChatGPT a bit when it first became available. I’d ask it to describe a software engineering concept, such as Hexagonal Architecture or a Design Pattern. It did a reasonable job, even if the summary’s language was a bit stilted. Then, for a goof, I’d ask it to describe the same concept as if written by Dr. Seuss, which usually evoked a chuckle out of me.
I continued to experiment with GenAI, but I never used it for anything serious. The online courses I recently audited focused upon prompt engineering: patterns that help produce good results from GenAI. I tried some prompt engineering patterns as I was learning them, and in some cases I was gobsmacked by the results.
Using the chef analogy from long ago, the LLM would be like a well-stocked kitchen. GenAI would be like the chef who creates a bespoke meal from that well-stocked kitchen based upon the customer’s requests. The meal would be the result. That meal could be great or end up tasting like garbage.
A Brief History Of Knowledge Organization And Access
LLM technology is the next advancement in accessing knowledge on the World Wide Web, or Web for short. Long before the Web, human knowledge was stored on paper in filing cabinets, binders, books, libraries, etc. Web knowledge acquisition mirrors paper knowledge acquisition, which I’ll use as a metaphor.
Used Bookstore
Used bookstores might have some organization to them, but each bookstore is unique in how it organizes books.
The early days of the Web were like this. People posted pages anywhere they could host them. It could be challenging to find what you were looking for. Once you found it, you had to make a note of its location, because there was a good chance you might never find it again.
Public Library
Public libraries standardized the organization of books. The Dewey Decimal System is still in use in libraries. A card catalog lists each book on an index card with a brief description and where to find it in the stacks.
Early attempts to organize the Web were based upon categories and rudimentary search engines akin to card catalogs. They produced broad results from which the user still had to do a lot of weeding. The website you actually needed might be quite low on the list of results.
Reference Librarian
Most libraries have a reference librarian, who often has a degree in library science. Ask a reference librarian for assistance and provide some details about your quest, and he or she will be able to send you to specific locations in the library to find what you desire.
Google was the first Web reference librarian. It provided ranked page results. Websites near the top of its results page were quite often exactly what we needed.
Google achieved this via the relationships among websites. Most websites contain URL links referencing other websites. When one website links to another, it implicitly endorses the linked website. Google counts the number of endorsements a website receives from other websites. The more endorsements a website receives, the more credibility it has, and therefore the more likely it is to rank higher. If a highly credible website links to another website, then that other website obtains higher credibility as well, much like a celebrity endorsement.
Google’s PageRank algorithm is much more complex, but that’s the basic gist, as sketched below.
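To make the endorsement idea concrete, here’s a minimal Python sketch of it. This is a toy illustration only, not Google’s actual algorithm; the three-page link graph, damping factor and iteration count are my own assumptions for demonstration.

```python
# A toy PageRank-style ranking (hypothetical, heavily simplified).
# Each page starts with equal credibility, then repeatedly passes a share
# of its credibility to every page it links to (an "endorsement").
def simple_pagerank(links, iterations=50, damping=0.85):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # every page keeps a small baseline of credibility...
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outgoing in links.items():
            for target in outgoing:
                # ...and the rest flows along links as endorsements
                new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank

# A and B both endorse C, so C ends up with the highest rank.
print(simple_pagerank({"A": ["C"], "B": ["C"], "C": ["A"]}))
```

Run it, and C ranks highest because it receives the most endorsements, while A outranks B because A is endorsed by the highly credible C, the celebrity-endorsement effect in miniature.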
The Law Clerk
I’m going to shift from the public library to the law library.
Judges may have a staff of law clerks who research and draft legal documents. The clerks dig through law libraries looking for precedents pertinent to a specific case. Rather than providing the judge with a list of case law to consider, they aggregate that information and produce a report for the judge with references so that the judge can investigate further if desired.
GenAIs are like law clerks. They don’t return websites, nor do they provide a ranked list of websites that might answer your question. They construct a bespoke, customized answer.
But unlike human law clerks, who have some degree of competency, we can’t always depend upon GenAI to produce accurate results. A lawyer in a federal case learned this the hard way in 2023, when he used ChatGPT as an electronic law clerk to generate his legal brief. ChatGPT generated fake cases. The judge was not amused. See: Lawyer cites fake cases generated by ChatGPT in legal brief.
GenAI can do more than generate reports. GenAI can emulate personas, including real people. We can interact with GenAI without it losing context, which is something Google search cannot do.
Google’s lack of market dominance in this next phase of knowledge organization and acquisition probably keeps its executives awake at night.
Will AI Save The World Or Destroy It?
The media is shouting both extremes from the mountaintops:
- AI will save the world!
- AI will destroy the world!
LLMs and GenAI are tools. Like most tools, their impact depends upon the intent of the hand wielding the tool. In the case of AI, I think the concern goes a bit beyond that: there’s the worry of unintended consequences that could arise if we unleash AI. There’s the long-standing Science-Fiction trope of AI being given the task of ensuring world peace, which it completes by exterminating all humanity. Fortunately, we have lived past Skynet’s self-aware date of 1997.
I don’t feel that either AI extreme is likely to occur. AI will be a tool that automates knowledge consumption, and at some point, it will feel as commonplace as previous disruptive technologies: telephones, smartphones, the internet, cars, air travel, etc.
This was debated informally in a Planet Money podcast episode: Is AI overrated or underrated?
Will AI Destroy The Environment?
While LLMs and GenAI may not destroy the world in a manner befitting a Science-Fiction movie, they require a significant amount of electricity, which could contribute to climate change.
The Communications of the ACM published a blog, The Energy Footprint of Humans and Large Language Models, in June 2024:
The cost of training a big foundation model can be daunting. The electricity required to train GPT-3 was estimated at around 1,287,000 kWh. Estimates for Llama 3 are a little above 500,000 kWh, a value that is in the ballpark of the energy use of a seven-hour flight of a big airliner. However, in contrast to a single long flight that is not reusable, a foundation model once trained will instantiate a set of weights that can be shared and reused in many different instances.
… foundational LLMs are perfectly cloneable and can be a base for finetuning towards more specific tasks. This allows for quickly amortising the training costs they incur.
IEEE Spectrum published an article, Generative AI’s Energy Problem Today Is Foundational: Before AI can take over, it will need to find a new approach to energy, in October 2023:
[… the] popularity [of GenAI] comes at a steep energy cost, a reality of operations today that is often shunted to the margins and left unsaid.
[Alex de Vries, Ph.D. candidate at VU Amsterdam and founder of the digital-sustainability blog Digiconomist,] predicted that current AI technology could be on track to annually consume as much electricity as the entire country of Ireland (29.3 terawatt-hours per year).
The training process of these LLMs typically receives the brunt of environmental concern—models can consume many terabytes of data and use over 1,000 megawatt-hours of electricity.
As for users asking ChatGPT to write silly stories or generate fantastical photos, … these individual habits are not going to make or break the environmental impact of these products. That said, thinking and speaking critically about the impact of these products could help push the needle in the right direction as developers work behind the scenes.
Are LLM Training Data Obtained Legally or Ethically?
LLMs only work if they are built from large amounts of training data, which is obtained from the internet. Though content on the internet may be freely accessible, that doesn’t mean it isn’t also protected intellectual property.
How much training data is legally and ethically sourced? How much training data is scraped from intellectual property without compensation? Many artists have been outspoken that LLM technology, when trained upon their works, is denying them their livelihoods. Why pay an artist when the LLM can generate something in the style of that artist? See: Artists take new shot at Stability, Midjourney in updated copyright lawsuit.
Maybe LLM providers should pay for access to copyrighted content. See: Legal cases question IP in large language model training.
What About Generation Loss?
Did you ever make a copy of a copy on a copier and then copy that copy, and so on? Each new copy derives from the original, but imperfections quickly amass. Smudges on the copier’s surface and other imperfections propagate. Eventually the last link in the chain will have degraded so much that its resemblance to the original will be nil.
GenAI output is a copy of language patterns derived from an LLM; it is not an original human creation. GenAI content is being posted to the internet. Future LLMs are being trained upon it. Those LLMs will generate content that will be posted and used as training data for the LLMs after them. Each generation of LLM risks drifting further from content created by humans. Like the copied copies, it runs the risk of similar degradation.
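Here’s a hypothetical little Python simulation of that copier effect. Each "generation" copies the previous copy while randomly corrupting a small fraction of characters; the error rate and the sample text are made up for illustration, but the compounding degradation is the point.

```python
# Simulate generation loss: each copy corrupts a small fraction of characters,
# and the errors compound because each generation copies the previous copy.
import random

random.seed(42)  # make the illustration repeatable

def noisy_copy(text, error_rate=0.02):
    # each character has a small chance of being replaced by a random one
    letters = "abcdefghijklmnopqrstuvwxyz "
    return "".join(
        random.choice(letters) if random.random() < error_rate else ch
        for ch in text
    )

text = "the quick brown fox jumps over the lazy dog"
for generation in range(1, 51):
    text = noisy_copy(text)  # copy the previous copy
    if generation % 10 == 0:
        print(f"generation {generation:2d}: {text}")
```

By generation 50, the text bears little resemblance to the original, just like the fiftieth photocopy of a photocopy.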
The Next Most Likely Word
We’ve all encountered autocomplete in our technology, whether tapping out a text, searching for something on Google or writing a document. The computer will sometimes suggest the next word. Sometimes it’s exactly what we want, other times not.
GenAI is like autocomplete on steroids. GenAI returns the next most likely word repeatedly.
LLMs are statistical models that look for predictable patterns among words. Those patterns are based upon the nearly endless set of examples from the internet, ethically sourced or not. LLMs reflect the aggregated content posted by humans for the past several decades including the good, the bad and the ugly. If you’ve read something on the internet, there’s a good chance that it has been or will be absorbed by LLMs. This blog will eventually be absorbed into an LLM.
An LLM that acquires language patterns through repeated examples has been called a Stochastic Parrot. Like a real parrot, GenAI repeats back what it has heard from us without any understanding.
What do you suppose the next most likely word is in the following:
- Mary had a little _____
- Four score and seven _____
- To be, or _____
There’s a good chance your responses were: lamb, years, and not, respectively. Many readers could probably continue with the next set of words in these phrases as well.
If you’re a native English speaker, you’ve heard these phrases dozens if not hundreds of times. They’re ingrained in your mind. However, there is no correct word for any of these phrases. These are just the most likely words from that repeated exposure. Had I asked for the next word in the nursery rhyme, Lincoln’s Gettysburg Address or Hamlet’s soliloquy, then those responses would have been considered correct.
GenAI would probably respond with the same words for the same reasons most English speakers would. People and LLMs have both been trained with similar data. The LLM has been trained with dozens if not hundreds of examples of these phrases. The toy model below shows the idea in miniature.
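Here’s a toy Python "language model" that captures the spirit of this, massively simplified and purely illustrative. It counts which word follows which in a tiny training text, then repeatedly emits the next most likely word. Real LLMs use neural networks trained upon billions of examples, not simple lookup tables, but the mechanism of repeated exposure is the same.

```python
# A toy next-word predictor: count word pairs in a tiny "training set",
# then repeatedly predict the most frequently observed next word.
from collections import Counter, defaultdict

# repetition stands in for hearing the rhyme many times
training_text = "mary had a little lamb its fleece was white as snow " * 100

follower_counts = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follower_counts[current_word][next_word] += 1

def predict_next(word):
    # return the most frequently observed next word
    return follower_counts[word].most_common(1)[0][0]

# GenAI-style generation: emit the next most likely word, repeatedly.
word, sentence = "mary", ["mary"]
for _ in range(4):
    word = predict_next(word)
    sentence.append(word)
print(" ".join(sentence))  # -> mary had a little lamb
```

With only one rhyme in its "training data", the toy model parrots it back verbatim. Scale the same idea up to the whole internet and the results become far more flexible, but it’s still pattern repetition.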
People are constantly being trained at a rate of about 25,000 words per day, whereas LLMs are trained with billions of words in bulk. At that daily rate, a sixty-year-old has heard roughly 25,000 × 365 × 60 ≈ 550 million words, still hundreds of times less than an LLM’s training corpus. LLM algorithms analyze Large samples of Language to create their Models, which predict the next most probable word. GenAIs produce such convincing, human-like responses because their LLMs have been trained with so many examples created by humans. ChatGPT’s underlying model contains hundreds of billions of parameters.
Are LLMs/GenAIs Sentient?
GenAI often produces results that suggest sentience or consciousness in the LLM. I may use words in these blogs that reinforce this notion, such as: thinks, considers, plans, reasons, etc. I do this only as a convenience of language. I do not believe there is an underlying sentience or consciousness in LLMs. GenAI responses appear sentient because they mirror the sentience of human language.
We humans can’t even define sentience or consciousness for ourselves. It feels disingenuous to declare with absolute certainty that LLMs and GenAIs do or do not have sentience or consciousness when we can’t declare with absolute certainty whether people do either.
My thoughts consist of a continuous stream of words in my head. What if my thoughts are only my brain producing the next most likely word, based upon my hearing 25,000 words each day for the past six decades? Ever notice that long-term memory only seems to start in children as they acquire language?
The following is part of a conversation about the nature of intelligence between host David Kestenbaum and his guest Peter Lee, head of Microsoft Research, from the This American Life podcast episode Greetings, People Of Earth:
David Kestenbaum’s introduction to the following conversation: Peter told me he’s still not willing to say [GenAI] truly understands. And yet, it was doing all this [understanding behavior]. It made him question so many things about how he thought intelligence worked. How did this machine do this if it was just predicting the next word?
Peter Lee: It does make me wonder how much of our intelligence is truly complicated and special.
David Kestenbaum: I mean, you get something that’s not far from it by just saying, what’s the next word?
Peter Lee: That’s the disturbing bit about this. And again, you know, to ask you, what are we doing in this conversation right now? Are we kind of making it up on the fly one word at a time? Every nerve and bone in my body says, no, we’re thinking far ahead, we’re learning on the fly, all these other things that we think that we’re doing. And we probably are, in some ways. But maybe a big chunk of intelligence is a lot simpler than we think and a lot less special than we think.
Caveat Lector!
GenAI generates the next most likely word. But that doesn’t mean that the next most likely word is the best word or even a good one. It may be quite bad.
GenAI will generate the next most likely word regardless of how good or bad that word is. When the output is fluent but false, it’s known as a hallucination, though I think confabulation would have been a better term. GenAI will rarely respond with: I don’t know, or I can’t answer that.
GenAI is like the student who didn’t study for a test and then submits anything, hoping for at least some partial credit. However, unlike the student, whose lack of knowledge is usually obvious, GenAI is a convincing confabulator. If its incorrect responses are challenged, it will apologize, and then it will often provide a new, equally incorrect response. It’s notorious for returning fabricated experts, publications and URLs.
It would be nice if GenAI provided the references used in constructing its responses, or its confidence level in them. But LLMs don’t maintain references. I previously asked what word comes next in: Mary had a little … . I’m sure almost everyone thought lamb. How do you know that’s the next word? Can you provide a verifiable reference from within your mind? Do you remember the first time you heard that nursery rhyme? Probably not.
Just as we can’t provide references for a lot of what we know, GenAI cannot either. It’s similar for confidence level: how much confidence do you assign to the next word in your stream of consciousness? Future advancements in LLMs may account for references and confidence levels, but either the capability isn’t there, or it’s not being displayed.
Even when the generated content is correct, its responses teeter upon the precipice of the uncanny valley. The responses will be syntactically and semantically correct, but they will be a bit stilted. They will feel mechanically artificial.
We should observe the old Russian proverb: Trust, but verify. Let the reader beware! Never accept GenAI results as is. Always review them for correctness and natural style. Apply critical thinking. Modify the responses as necessary before sharing them with others.
GenAI will refuse to respond in some cases, such as those dealing with:
- Information that’s newer than its training data. For example, ChatGPT 3.5’s training cutoff date was September 2021. This may not be as much of a restriction as before. In September 2024 I asked ChatGPT for information about the 2024 Paris Olympics and the Trump/Harris debate. It replied with brief answers, and its results suggested that it was performing live internet searches to produce them. NOTE: I found a YouTube video, ChatGPT Pre-Prompt Text Leaked, which provides some insight on this. ChatGPT prepends its own prompts, which include telling it that it has access to browse, so that it can access the internet if needed. This isn’t traditional code. These are prompts written in natural English telling ChatGPT what it knows and how to use that knowledge (see the sketch after this list).
- Specific websites. For example, I’ve provided ChatGPT with the URL for the landing page for my blogs and asked it to answer questions based upon the blogs listed in that URL. It refuses to do so. This would be a nice feature, especially when dealing with information that’s newer than its training data.
- Sensitive topics, such as: sexual abuse, violence, racism and sexism.
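For the curious, here’s a rough sketch of what prepending prompts looks like from a developer’s point of view, using OpenAI’s Python SDK. The system message below is my own invented stand-in, far shorter than ChatGPT’s real pre-prompt, but it shows how behavior is steered with natural-English instructions rather than traditional code.

```python
# A hedged sketch of a prepended "system" prompt (the instructions are invented).
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-4o-mini",  # an assumed model choice for illustration
    messages=[
        # Natural-English instructions the model sees before the user's prompt:
        {
            "role": "system",
            "content": "You have a browsing tool. Use it for questions "
                       "about events after your training cutoff.",
        },
        {"role": "user", "content": "Who hosted the 2024 Summer Olympics?"},
    ],
)
print(response.choices[0].message.content)
```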
Will GenAI Take Our Jobs?
Is GenAI going to take our jobs? The Planet Money podcast explored this in a three-part series, Planet Money Makes an Episode Using AI, in which they created an episode generated entirely by AI.
While GenAI won’t take all our jobs, it will change the nature of many jobs. Those who know how to leverage GenAI will take the jobs of those who do not. Those who learn how to harness GenAI, especially as it improves, will remove much toil from their lives.
Long before GenAI, Planet Money produced a 2015 podcast episode about spreadsheets and their impact upon the accounting profession:
The spreadsheet both killed jobs and created jobs. While the number of bookkeepers and accounting clerks fell, the electronic spreadsheet actually meant more jobs for regular accountants.
What happened is that accounting basically became cheaper. And sometimes, when something gets cheaper, people buy a lot more of that thing. Clients bought more accounting. They asked to run more numbers for them.
Spreadsheets removed accounting toil. How many professional accountants do you think are still in business who never learned to use spreadsheets?
GenAI is going to do the same for all knowledge workers. The Communications of the ACM published The Impact of AI on Computer Science Education in July 2024. While it focuses upon computer science education, it addresses impacts beyond computer science as well.
A whopping 90% of jobs will be disrupted in some way by generative AI. Whereas in the past technological advances and automation have mainly impacted manual labor and process-centric knowledge work, the study found that “Generative AI is poised to do the opposite, having a higher disruption on knowledge work.”
AI won’t take away jobs, but it will change the nature of jobs. It will change the need to do certain things manually, just as word processing eliminated the need to type things manually.
There is already a big demand for people who can do prompt engineering, interacting with LLMs to get the information you want. That’s a job that didn’t exist two years ago.
For AI to be successful at augmentation [of human capabilities], humans have to determine how systems are designed, the role of the augmenter, how the human’s job description changes, and how to create a successful partnership.
Summary
LLMs and GenAI are recent developments in computing, especially for the general public. They could be the next leap in knowledge acquisition, but jump with care. GenAI will confidently and convincingly confabulate results. And remember that there’s no there, there. The appearance of independent thought is an illusion.
Those who learn how to leverage LLMs and GenAI will have a competitive advantage over those who do not. LLMs and GenAI, like most tools, are force multipliers. Those with a good foundation and the knowledge to know how to use them well will be more productive. Those who don’t will create a bigger mess faster.
LLMs may contain nuggets of gold trapped within layers of useless rock. What if the cure for cancer exists in snippets of knowledge scattered across various medical and science journals, too vastly distributed for any one person to find and connect? What if those snippets reside in an LLM, and some medical student begins the right GenAI session that connects them?
We need picks and shovels to find those nuggets of gold. My next blog or two will focus upon prompt engineering patterns, which are the picks and shovels that can help us release meaningful information from LLMs using GenAI.
References
Here are the online courses I mentioned:
- Generative AI: Prompt Engineering Basics - Provided by IBM. A nice, quick overview.
- Prompt Engineering for ChatGPT - Provided by Vanderbilt University. More depth, with prompt engineering patterns that the instructor demonstrates live on ChatGPT. This is the first course in a series of three. I have enrolled in the other two, but I have not started them yet.
There are many freely available LLMs/GenAIs on the internet. Here are several to explore:
- OpenAI - This is a portal to OpenAI’s models
- ChatGPT - A conversational GenAI
- Gemini - Google’s GenAI
- Copilot - Microsoft’s GenAI
- Perplexity.ai - A conversational GenAI with references and more specific related questions/links
- Phind - A conversational GenAI similar to Perplexity.ai
- Pi.ai - A personal AI