Similar articles
Article Id Title Prob Score
221003 ZDNET 2021-11-22:
The absurd beauty of hacking Nvidia's GauGAN 2 AI image machine
1.000
220911 VENTUREBEAT 2021-11-22:
Nvidia’s latest AI tech translates text into landscape images
0.950 0.646
220927 THEVERGE 2021-11-24:
WhatsApp on the web gets a built-in sticker maker
0.334
220842 ARSTECHNICA 2021-11-24:
“NFT” picked as word of the year—deal with it
0.298
221170 ZDNET 2021-11-25:
AI design changes on the horizon from open-source Apache TVM and OctoML
0.290
220938 TECHREPUBLIC 2021-11-24:
How to use Bash associative arrays
0.287
220891 ARSTECHNICA 2021-11-22:
In a first, scientists captured growth of butterfly wings inside chrysalis on video
0.276
221008 VENTUREBEAT 2021-11-23:
Microsoft’s Tutel optimizes AI model training
0.264
220828 VENTUREBEAT 2021-11-24:
The shape of edge AI to come
0.243
221021 TECHREPUBLIC 2021-11-22:
How to generate random letters in Excel
0.242
220937 VENTUREBEAT 2021-11-22:
How Nvidia aims to demystify zero trust security
0.238
221132 ARSTECHNICA 2021-11-25:
Scientists use seismic noise to image first hundred meters of Mars
0.232
220980 ARSTECHNICA 2021-11-23:
“Vulture bees” evolved a taste for flesh—and their microbiomes reflect that
0.231
220723 THEVERGE 2021-11-19:
Discord now lets you send up to 10 (accessible) memes at a time
0.229
220978 ARSTECHNICA 2021-11-23:
Math may have caught up with Google’s quantum-supremacy claims
0.227
220850 VENTUREBEAT 2021-11-24:
Why fast, effective data labeling has become a competitive advantage (VB Live)
0.226
220967 TECHREPUBLIC 2021-11-22:
3 ways to add color to the numbers in a numbered list in Word
0.225
220943 VENTUREBEAT 2021-11-22:
Why enterprises should build AI infrastructure first
0.219
220970 TECHREPUBLIC 2021-11-22:
How to install and use InVID, a plugin to debunk fake news and verify videos and images
0.217
220826 THEVERGE 2021-11-23:
Elizabeth Holmes admitted to a key part of the case against her
0.215
220839 ARSTECHNICA 2021-11-23:
Battlefield 2042 review: The future of warfare is meaningless
0.213
221174 ZDNET 2021-11-24:
NSW government clamps down on apartment building defects using blockchain and AI
0.206
220864 ZDNET 2021-11-24:
Need a mouth-watering Thanksgiving recipe? Ask AI
0.205
220983 VENTUREBEAT 2021-11-19:
AI Weekly: Defense Department proposes new guidelines for developing AI technologies
0.202
220734 ARSTECHNICA 2021-11-19:
New study debunks controversial 2015 fossil find: It’s not a four-limbed snake after all
0.202

ID: 221003

URL: https://www.zdnet.com/article/the-absurd-beauty-of-hacking-nvidias-gaugan-2-ai-image-machine/

Date: 2021-11-22

The absurd beauty of hacking Nvidia's GauGAN 2 AI image machine

Typing nonsense phrases into Nvidia's algorithm produces "errors" that are at times beautiful, at times wretched, and in most cases fascinating.

Typing the words "ZDNet superb reporting" into Nvidia's GauGAN 2 AI program automatically produces surreal images.

Type the words "ZDNet superb reporting" into Nvidia's new artificial intelligence demo, GauGAN 2, and you will see a picture of what looks like large pieces of foam insulation wrestling in a lake against a snowy backdrop. Add more words, such as "ZDNet superb reporting comely," and you'll see the image morph into something new, some barely recognizable form, perhaps a Formula One race car that has been digested, proceeding along what looks sort of like a road, in front of blurry views of a man-made structure.

GauGAN 2 produces a strange interpretation of the phrase "ZDNet superb reporting comely."

Roll the dice with a little button showing an image of two dice, and the same phrase becomes a spooky, mist-shrouded landscape with a yawning mouth of some organic nature, though its exact species is completely unidentifiable.

Another roll of the dice produces this bizarre landscape-plus-creature.

Typing phrases is the latest way to control GauGAN, an algorithm developed by graphics chip giant Nvidia to showcase the state of the art of AI. The original GauGAN program was introduced in early 2019 as a way to draw and have the program automatically generate a photo-realistic image by filling in the drawing.

The term "GAN" in the name refers to a broad class of neural network programs, called generative adversarial networks, introduced in 2014 by Ian Goodfellow and colleagues. GANs use two neural networks operating at cross purposes, one producing output that it steadily refines until the second neural network labels the output valid. The competitive nature of the back and forth is why they're called "adversarial."
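To make that back and forth concrete, here is a minimal sketch of a single adversarial round. It is a toy, not Nvidia's code: plain numbers stand in for images, the "generator" is one adjustable offset, and the "discriminator" is a logistic scorer. The function name `train_round`, the parameters `w`, `b` and `theta`, the learning rate, and the target mean of 4.0 are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Squash a score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def train_round(state, rng, batch=64, lr=0.05, real_mean=4.0):
    """Run one discriminator update, then one generator update.

    state holds the toy parameters (all hypothetical):
      w, b   -- discriminator D(x) = sigmoid(w * x + b)
      theta  -- generator offset, G(z) = theta + z
    """
    w, b, theta = state["w"], state["b"], state["theta"]
    real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
    fake = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]

    # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)),
    # i.e. score real samples as valid and generated samples as invalid.
    grad_w = grad_b = 0.0
    for x in real:
        d = sigmoid(w * x + b)
        grad_w += (1.0 - d) * x
        grad_b += 1.0 - d
    for x in fake:
        d = sigmoid(w * x + b)
        grad_w -= d * x
        grad_b -= d
    w += lr * grad_w / batch
    b += lr * grad_b / batch

    # Generator step: gradient ascent on log D(fake), i.e. nudge theta so the
    # freshly updated discriminator is more likely to label the output valid.
    grad_theta = sum((1.0 - sigmoid(w * x + b)) * w for x in fake) / batch
    theta += lr * grad_theta

    return {"w": w, "b": b, "theta": theta}


if __name__ == "__main__":
    rng = random.Random(0)
    state = {"w": 0.0, "b": 0.0, "theta": 0.0}
    for step in range(5):
        state = train_round(state, rng)
        print(step, state)
```

In the first round the discriminator learns that larger numbers look "real," and the generator responds by shifting its output upward. Longer runs of a toy like this can oscillate rather than settle, which is one reason training real GANs is considered delicate.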
Nvidia has done pioneering work extending GANs, including the introduction in 2018 of "StyleGAN," which made it possible to generate highly realistic fake photos of people. In that work, the neural network "learned" high-level aspects of faces and also low-level aspects, such as skin tone.

In the original GauGAN from 2019, Nvidia used a similar approach, letting one draw a landscape as areas, known as a segmentation map. Those high-level abstractions, such as lakes and rivers and fields, became a structural template, and the GauGAN program would then fill in the drawn segmentation map with real-world forms.

Version two of the program has been updated to handle language. The intention is that one will prompt GauGAN 2 with sensible phrases, things pertaining to landscapes, such as "coast ripples cliffs." The GauGAN 2 program will respond by generating a realistic-looking scene that matches that input.

The program was developed in its "training" phase by being fed 10 million high-quality landscape images, says Nvidia, using the Selene supercomputer built from Nvidia GPUs. A segmentation map can also be created automatically, allowing one to go back and edit the layout of the landscape in the way the original GauGAN allowed one to draw it.

As Nvidia describes GauGAN 2 in a blog post, the combination of text and image and segmentation map is a breakthrough in multi-modality AI:

"GauGAN2 combines segmentation mapping, inpainting and text-to-image generation in a single model, making it a powerful tool to create photorealistic art with a mix of words and drawings. The demo is one of the first to combine multiple modalities -- text, semantic segmentation, sketch and style -- within a single GAN framework. This makes it faster and easier to turn an artist's vision into a high-quality AI-generated image."
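For readers unfamiliar with the term, a segmentation map can be pictured as a grid in which every cell carries a label rather than a color. The sketch below is only an illustration of that idea: the label names and the helpers `label_areas` and `relabel` are hypothetical, and the article does not describe how GauGAN stores its maps internally.

```python
from collections import Counter

# A tiny 4x4 "landscape" drawn as labeled areas instead of pixels.
# The label names are made up for this example.
SEGMENTATION_MAP = [
    ["sky",   "sky",   "sky", "sky"],
    ["cliff", "cliff", "sky", "sky"],
    ["cliff", "cliff", "sea", "sea"],
    ["sand",  "sea",   "sea", "sea"],
]


def label_areas(seg_map):
    """Count how many cells each label covers."""
    return Counter(label for row in seg_map for label in row)


def relabel(seg_map, old, new):
    """Return an edited copy of the map with one label swapped for another."""
    return [[new if label == old else label for label in row] for row in seg_map]


if __name__ == "__main__":
    print(label_areas(SEGMENTATION_MAP))
    edited = relabel(SEGMENTATION_MAP, "sea", "lake")
    for row in edited:
        print(" ".join(f"{label:<5}" for label in row))
```

Editing the layout, in this picture, amounts to changing labels in the grid; the generator's job is then to fill each labeled area with plausible texture.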
The practical benefit, says Nvidia, is that one can use a few words to get a basic image together without any drawing at all and then tweak details to refine the final output.

But adding words that don't have anything to do with landscapes, such as "ZDNet," starts to generate crazy artefacts that have at times revolting freakishness, and at times appalling beauty, depending on your taste.

In the terminology of deep learning, the freakish images produced by nonsense phrases result from the program having to grapple with language that is "out of distribution," meaning not captured in the training data fed to the machine. Faced with irreconcilable phrases, the program struggles to match an image to the phrase.

As can be seen in a series of images, the phrase "coast ripples cliffs" produces a very faithful image at first. Adding qualifiers with impertinent words, such as bicycle, New York City, and the name Cassandra, starts to shift and shape the landscape in strange ways.

Automatic output by GauGAN2 of the phrase "coast ripples cliffs."

Automatic output by GauGAN2 of the phrase "coast ripples cliffs bicycle New York Cassandra drill airplane wisely pneumatic ostentatious."

Even more interesting things happen when all the landscape words are removed, leaving only the nonsense. Strange, futuristic landscapes or multi-colored amoebae come into view.

Automatic output by GauGAN2 for the phrase "Cassandra drill airplane wisely pneumatic ostentatious."

Automatic output by GauGAN2 for the word "ostentatious."

Automatic output by GauGAN2 for the phrase "wisely pneumatic ostentatious."

The experiment can be taken even further with extended phrases that are suggestive without exactly being descriptive. Try feeding in the first line of T.S. Eliot's poem The Waste Land, "April is the cruellest month, breeding lilacs out of the dead land."
The results are some striking images that are, in fact, somewhat appropriate. As one rolls the dice, many variants of appropriate landscapes emerge, with only slight artefacts in some cases.

"April is the cruellest month, breeding lilacs out of the dead land," T.S. Eliot, The Waste Land.

Thanks to the innovations of StyleGAN, GauGAN is able to apply a style to the image, essentially conditioning the output to be in the form of some other image, rather like a mash-up.

The application of style to Eliot's poem distorts the faithful landscape images beyond recognition. Once again, a whole host of weird objects appear, some with a kind of sickening organic quality, others merely broken shards of what was once an image.

One can also submit images and even draw on GauGAN 2. Submitting an old photograph taken at Þingvellir, the site of the ancient Icelandic parliament, didn't do much. The image remained mostly untransformed, in limited testing.

A photo taken at Þingvellir, the site of the ancient Icelandic parliament, was mostly unaltered when submitted to GauGAN2.

Adding the word "Þingvellir," however, produced a realistic-enough landscape that was in keeping with the Þingvellir site.

GauGAN2 output for the word "Þingvellir" was in the spirit of the ancient Icelandic landscape.

Adding the word "volcano" produced a striking alternative landscape, less realistic, more surreal.

GauGAN2 automatic output for "Þingvellir volcano."

Adding an impertinent word, such as "technology," further shook up the landscape, adding strange nonsense figures.

GauGAN2 automatic output for the phrase "Þingvellir technology."

Rather than submit a photo of a landscape, one can draw, as was the case in the original GauGAN. Again, choosing something not in keeping with the demo, a drawing not of a landscape but of a person's head, produces more interesting results. The face can be re-skinned, if you will, by using the mash-up function.
Rolling the dice produced interesting variations.

Drawing directly in GauGAN2.

Drawing of a head re-skinned by using the layers feature in GauGAN2.

Combining the drawing with the word "Þingvellir" produced subtle changes, as did adding words such as "volcano" and "rift." The image was re-skinned to have a kind of volcano-like texture.

Drawing of a head combined with the words "Þingvellir volcano rift" and re-skinned by using the layers feature in GauGAN2.

Note that the user interface of the app can be hard to scroll in desktop browsers. For some reason, it seems to work better in a browser on a tablet, such as an iPad.