ChatGPT Is a Blurry JPEG of the Web

by Google Publishers
February 9, 2023
in Entertainment

In 2013, workers at a German construction company noticed something odd about their Xerox photocopier: when they made a copy of the floor plan of a house, the copy differed from the original in a subtle but significant way. In the original floor plan, each of the house’s three rooms was accompanied by a rectangle specifying its area: the rooms were 14.13, 21.11, and 17.42 square metres, respectively. However, in the photocopy, all three rooms were labelled as being 14.13 square metres in size. The company contacted the computer scientist David Kriesel to investigate this seemingly inconceivable result. They needed a computer scientist because a modern Xerox photocopier doesn’t use the physical xerographic process popularized in the nineteen-sixties. Instead, it scans the document digitally, and then prints the resulting image file. Combine that with the fact that virtually every digital image file is compressed to save space, and a solution to the mystery begins to suggest itself.

Compressing a file requires two steps: first, the encoding, during which the file is converted into a more compact format, and then the decoding, whereby the process is reversed. If the restored file is identical to the original, then the compression process is described as lossless: no information has been discarded. By contrast, if the restored file is only an approximation of the original, the compression is described as lossy: some information has been discarded and is now unrecoverable. Lossless compression is what’s typically used for text files and computer programs, because those are domains in which even a single incorrect character has the potential to be disastrous. Lossy compression is often used for photos, audio, and video in situations in which absolute accuracy isn’t essential. Most of the time, we don’t notice if a picture, song, or movie isn’t perfectly reproduced. The loss in fidelity becomes more perceptible only as files are squeezed very tightly. In those cases, we notice what are known as compression artifacts: the fuzziness of the smallest JPEG and MPEG images, or the tinny sound of low-bit-rate MP3s.
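
The difference between the two kinds of compression can be made concrete with a small sketch. It is an illustration only, not how JPEG or MP3 actually work: the lossless half uses Python's standard zlib module, while the lossy half is a made-up scheme (the names lossy_encode and lossy_decode are invented for this example) that keeps just the top four bits of each sample.

```python
import zlib


def lossy_encode(samples: bytes) -> bytes:
    """Hypothetical lossy scheme: keep only the top 4 bits of each byte.

    A real codec would also pack two 4-bit values into one byte to halve
    the size; that step is left out to keep the sketch short.
    """
    return bytes(s >> 4 for s in samples)


def lossy_decode(encoded: bytes) -> bytes:
    """Put each value back in the middle of the 16-wide bucket it came from."""
    return bytes((e << 4) | 8 for e in encoded)


# Stand-in for audio or pixel samples.
original = bytes(range(0, 256, 3)) * 4

# Lossless: decoding returns exactly what went in.
restored_lossless = zlib.decompress(zlib.compress(original))

# Lossy: decoding returns an approximation; the low bits are gone for good.
restored_lossy = lossy_decode(lossy_encode(original))
max_error = max(abs(a - b) for a, b in zip(original, restored_lossy))

print(restored_lossless == original)  # True
print(restored_lossy == original)     # False
print(max_error)                      # 8
```

Every restored sample is within eight units of the original, which is the sense in which the lossy copy is "only an approximation": close enough to pass at a glance, but no longer the same data.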

Xerox photocopiers use a lossy compression format known as JBIG2, designed for use with black-and-white images. To save space, the copier identifies similar-looking regions in the image and stores a single copy for all of them; when the file is decompressed, it uses that copy repeatedly to reconstruct the image. It turned out that the photocopier had judged the labels specifying the area of the rooms to be similar enough that it needed to store only one of them—14.13—and it reused that one for all three rooms when printing the floor plan.
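
The substitution step described above can be sketched in a few lines. This is a toy model, not the JBIG2 format itself: the 3-by-5 digit bitmaps, the Hamming-distance comparison, and the max_diff threshold are all assumptions chosen to keep the example small.

```python
from typing import List, Tuple

# Hypothetical 3x5 bitmaps, flattened row by row into 15-character strings.
ONE = "010" "110" "010" "010" "111"
SIX = "111" "100" "111" "101" "111"
EIGHT = "111" "101" "111" "101" "111"


def hamming(a: str, b: str) -> int:
    """Count the pixels at which two same-sized bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(glyphs: List[str], max_diff: int) -> Tuple[List[str], List[int]]:
    """Store one bitmap per group of similar-looking glyphs, plus an index per glyph."""
    dictionary: List[str] = []
    indices: List[int] = []
    for glyph in glyphs:
        for i, stored in enumerate(dictionary):
            if hamming(glyph, stored) <= max_diff:
                indices.append(i)  # "similar enough": reuse the stored copy
                break
        else:
            dictionary.append(glyph)
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decode(dictionary: List[str], indices: List[int]) -> List[str]:
    """Rebuild the page by pasting the stored bitmap at every position."""
    return [dictionary[i] for i in indices]


page = [ONE, SIX, ONE, EIGHT]

strict = decode(*encode(page, max_diff=0))
loose = decode(*encode(page, max_diff=1))

print(strict == page)    # True: exact matches only, nothing is lost
print(loose[3] == SIX)   # True: the 8 has been silently replaced by a 6
```

With the looser threshold the decoded page is perfectly crisp, yet one digit is wrong, because the 6 and the 8 differ by a single pixel in this toy font. That is the same failure the photocopier exhibited: output that is readable but incorrect.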

The fact that Xerox photocopiers use a lossy compression format instead of a lossless one isn’t, in itself, a problem. The problem is that the photocopiers were degrading the image in a subtle way, in which the compression artifacts weren’t immediately recognizable. If the photocopier simply produced blurry printouts, everyone would know that they weren’t accurate reproductions of the originals. What led to problems was the fact that the photocopier was producing numbers that were readable but incorrect; it made the copies seem accurate when they weren’t. (In 2014, Xerox released a patch to correct this issue.)

I think that this incident with the Xerox photocopier is worth bearing in mind today, as we consider OpenAI’s ChatGPT and other similar programs, which A.I. researchers call large-language models. The resemblance between a photocopier and a large-language model might not be immediately apparent—but consider the following scenario. Imagine that you’re about to lose your access to the Internet forever. In preparation, you plan to create a compressed copy of all the text on the Web, so that you can store it on a private server. Unfortunately, your private server has only one per cent of the space needed; you can’t use a lossless compression algorithm if you want everything to fit. Instead, you write a lossy algorithm that identifies statistical regularities in the text and stores them in a specialized file format. Because you have virtually unlimited computational power to throw at this task, your algorithm can identify extraordinarily nuanced statistical regularities, and this allows you to achieve the desired compression ratio of a hundred to one.

Now, losing your Internet access isn’t quite so terrible; you’ve got all the information on the Web stored on your server. The only catch is that, because the text has been so highly compressed, you can’t look for information by searching for an exact quote; you’ll never get an exact match, because the words aren’t what’s being stored. To solve this problem, you create an interface that accepts queries in the form of questions and responds with answers that convey the gist of what you have on your server.

What I’ve described sounds a lot like ChatGPT, or most any other large-language model. Think of ChatGPT as a blurry JPEG of all the text on the Web. It retains much of the information on the Web, in the same way that a JPEG retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry JPEG, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large-language models such as ChatGPT are all too prone. These hallucinations are compression artifacts, but—like the incorrect labels generated by the Xerox photocopier—they are plausible enough that identifying them requires comparing them against the originals, which in this case means either the Web or our own knowledge of the world. When we think about them this way, such hallucinations are anything but surprising; if a compression algorithm is designed to reconstruct text after ninety-nine per cent of the original has been discarded, we should expect that significant portions of what it generates will be entirely fabricated.

This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation—that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them. (“When in the Course of human events, it becomes necessary for one to separate his garments from their mates, in order to maintain the cleanliness and order thereof. . . .”) ChatGPT is so good at this form of interpolation that people find it entertaining: they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

Given that large-language models like ChatGPT are often extolled as the cutting edge of artificial intelligence, it may sound dismissive—or at least deflating—to describe them as lossy text-compression algorithms. I do think that this perspective offers a useful corrective to the tendency to anthropomorphize large-language models, but there is another aspect to the compression analogy that is worth considering. Since 2006, an A.I. researcher named Marcus Hutter has offered a cash reward—known as the Prize for Compressing Human Knowledge, or the Hutter Prize—to anyone who can losslessly compress a specific one-gigabyte snapshot of Wikipedia smaller than the previous prize-winner did. You have probably encountered files compressed using the zip file format. The zip format reduces Hutter’s one-gigabyte file to about three hundred megabytes; the most recent prize-winner has managed to reduce it to a hundred and fifteen megabytes. This isn’t just an exercise in smooshing. Hutter believes that better text compression will be instrumental in the creation of human-level artificial intelligence, in part because the greatest degree of compression can be achieved by understanding the text.

To grasp the proposed relationship between compression and understanding, imagine that you have a text file containing a million examples of addition, subtraction, multiplication, and division. Although any compression algorithm could reduce the size of this file, the way to achieve the greatest compression ratio would probably be to derive the principles of arithmetic and then write the code for a calculator program. Using a calculator, you could perfectly reconstruct not just the million examples in the file but any other example of arithmetic that you might encounter in the future. The same logic applies to the problem of compressing a slice of Wikipedia. If a compression program knows that force equals mass times acceleration, it can discard a lot of words when compressing the pages about physics because it will be able to reconstruct them. Likewise, the more the program knows about supply and demand, the more words it can discard when compressing the pages about economics, and so forth.
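
The arithmetic thought experiment can be tried on a small scale. The sketch below is only an illustration under stated assumptions: it generates ten thousand examples rather than a million, leaves out division to stay with whole numbers, and uses zlib as the stand-in for "any compression algorithm". The helper names are invented for the example.

```python
import random
import zlib
from typing import List

# The "principles of arithmetic", written down once as a tiny calculator.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def make_examples(n: int, seed: int = 0) -> List[str]:
    """Generate lines such as '12 * 7 = 84' (a stand-in for the text file)."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n):
        a, b = rng.randint(0, 999), rng.randint(0, 999)
        op = rng.choice(sorted(OPS))
        lines.append(f"{a} {op} {b} = {OPS[op](a, b)}")
    return lines


def strip_answers(lines: List[str]) -> List[str]:
    """Discard everything a calculator could recompute."""
    return [line.split(" = ")[0] for line in lines]


def reconstruct(questions: List[str]) -> List[str]:
    """Recompute each answer from its question."""
    rebuilt = []
    for question in questions:
        a, op, b = question.split()
        rebuilt.append(f"{question} = {OPS[op](int(a), int(b))}")
    return rebuilt


examples = make_examples(10_000)
questions = strip_answers(examples)

full_size = len(zlib.compress("\n".join(examples).encode()))
questions_size = len(zlib.compress("\n".join(questions).encode()))

print(reconstruct(questions) == examples)  # True: nothing was lost
print(full_size, questions_size)           # compressed sizes in bytes
```

A general-purpose compressor has to store every answer because it cannot tell that the answers follow from the questions. A program that knows arithmetic can drop them all and still rebuild the file exactly.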

Large-language models identify statistical regularities in text. Any analysis of the text of the Web will reveal that phrases like “supply is low” often appear in close proximity to phrases like “prices rise.” A chatbot that incorporates this correlation might, when asked a question about the effect of supply shortages, respond with an answer about prices increasing. If a large-language model has compiled a vast number of correlations between economic terms—so many that it can offer plausible responses to a wide variety of questions—should we say that it actually understands economic theory? Models like ChatGPT aren’t eligible for the Hutter Prize for a variety of reasons, one of which is that they don’t reconstruct the original text precisely—i.e., they don’t perform lossless compression. But is it possible that their lossy compression nonetheless indicates real understanding of the sort that A.I. researchers are interested in?
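
A deliberately crude sketch shows how far correlation alone can go. Everything in it is hypothetical: the six-sentence corpus, the fixed lists of phrases, and the counting scheme are invented for illustration and are vastly simpler than what a large-language model does.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List

# A made-up miniature "Web".
CORPUS = [
    "when supply is low prices rise",
    "supply is low so prices rise quickly",
    "because supply is low prices rise",
    "when supply is high prices fall",
    "supply is high and prices fall",
    "supply is low yet prices fall",  # a noisy counterexample
]
CAUSES = ["supply is low", "supply is high"]
EFFECTS = ["prices rise", "prices fall"]


def build_table(corpus: List[str]) -> DefaultDict[str, Counter]:
    """Count how often each effect phrase shares a sentence with each cause phrase."""
    table: DefaultDict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        for cause in CAUSES:
            if cause in sentence:
                for effect in EFFECTS:
                    if effect in sentence:
                        table[cause][effect] += 1
    return table


def answer(table: DefaultDict[str, Counter], cause: str) -> str:
    """Reply with whichever effect most often accompanied the cause."""
    return table[cause].most_common(1)[0][0]


table = build_table(CORPUS)

print(answer(table, "supply is low"))   # prices rise
print(answer(table, "supply is high"))  # prices fall
```

The program gives the textbook answer in both cases, yet it contains no notion of markets, buyers, or sellers, only a tally of which phrases tend to appear together.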

Let’s go back to the example of arithmetic. If you ask GPT-3 (the large-language model that ChatGPT was built from) to add or subtract a pair of numbers, it almost always responds with the correct answer when the numbers have only two digits. But its accuracy worsens significantly with larger numbers, falling to ten per cent when the numbers have five digits. Most of the correct answers that GPT-3 gives are not found on the Web—there aren’t many Web pages that contain the text “245 + 821,” for example—so it’s not engaged in simple memorization. But, despite ingesting a vast amount of information, it hasn’t been able to derive the principles of arithmetic, either. A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.
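
To see what failing to carry the "1" looks like, here is a small sketch of column-by-column addition that throws every carry away. It illustrates the kind of error described above and makes no claim about how GPT-3 works internally; the function name is invented for the example.

```python
def add_without_carry(a: int, b: int) -> int:
    """Add two non-negative integers column by column, discarding every carry."""
    result, place = 0, 1
    while a > 0 or b > 0:
        column = (a % 10 + b % 10) % 10  # keep the last digit, drop the carry
        result += column * place
        a, b, place = a // 10, b // 10, place * 10
    return result


# No column exceeds 9, so the missing carry goes unnoticed.
print(add_without_carry(245, 321), 245 + 321)  # 566 566

# One column overflows, and the answer is plausible but wrong.
print(add_without_carry(47, 38), 47 + 38)      # 75 85
```

A procedure like this is right whenever no column sums past nine, and it produces reasonable-looking wrong answers otherwise, so its accuracy falls as the numbers gain digits and carries become more likely.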

Given GPT-3’s failure at a subject taught in elementary school, how can we explain the fact that it sometimes appears to perform well at writing college-level essays? Even though large-language models often hallucinate, when they’re lucid they sound like they actually understand subjects like economic theory. Perhaps arithmetic is a special case, one for which large-language models are poorly suited. Is it possible that, in areas outside addition and subtraction, statistical regularities in text actually do correspond to genuine knowledge of the real world?

Tags: algorithms, artificial intelligence (a.i.), images, Internet, technology, writing