Similar articles
Article Id | Source | Date | Title | Prob | Score
220911 | VENTUREBEAT | 2021-11-22 | Nvidia's latest AI tech translates text into landscape images | 1.000
221003 | ZDNET | 2021-11-22 | The absurd beauty of hacking Nvidia's GauGAN 2 AI image machine | 0.950 | 0.646
220927 | THEVERGE | 2021-11-24 | WhatsApp on the web gets a built-in sticker maker | 0.339
221170 | ZDNET | 2021-11-25 | AI design changes on the horizon from open-source Apache TVM and OctoML | 0.333
221008 | VENTUREBEAT | 2021-11-23 | Microsoft's Tutel optimizes AI model training | 0.331
220850 | VENTUREBEAT | 2021-11-24 | Why fast, effective data labeling has become a competitive advantage (VB Live) | 0.313
220891 | ARSTECHNICA | 2021-11-22 | In a first, scientists captured growth of butterfly wings inside chrysalis on video | 0.308
220970 | TECHREPUBLIC | 2021-11-22 | How to install and use InVID, a plugin to debunk fake news and verify videos and images | 0.282
220983 | VENTUREBEAT | 2021-11-19 | AI Weekly: Defense Department proposes new guidelines for developing AI technologies | 0.272
220973 | VENTUREBEAT | 2021-11-23 | AI-powered voice transcription startup Verbit secures $250M | 0.271
220794 | VENTUREBEAT | 2021-11-19 | How Unity Simulation Pro and Unity SystemGraph use AI to train systems | 0.266
221132 | ARSTECHNICA | 2021-11-25 | Scientists use seismic noise to image first hundred meters of Mars | 0.255
220864 | ZDNET | 2021-11-24 | Need a mouth-watering Thanksgiving recipe? Ask AI | 0.248
220937 | VENTUREBEAT | 2021-11-22 | How Nvidia aims to demystify zero trust security | 0.240
220872 | VENTUREBEAT | 2021-11-21 | The metaverse will make your meetings worse | 0.237
220828 | VENTUREBEAT | 2021-11-24 | The shape of edge AI to come | 0.234
220960 | TECHREPUBLIC | 2021-11-22 | How to personalize the lock screen background image in Windows 11 | 0.231
220842 | ARSTECHNICA | 2021-11-24 | “NFT” picked as word of the year—deal with it | 0.227
220888 | VENTUREBEAT | 2021-11-22 | Asynchronous video is set to transform the workplace, courtesy of effortless video tools | 0.223
220914 | TECHREPUBLIC | 2021-11-23 | How to identify social media misinformation and protect your business | 0.217
220706 | TECHREPUBLIC | 2021-11-19 | Policymakers want to regulate AI but lack consensus on how | 0.210
220723 | THEVERGE | 2021-11-19 | Discord now lets you send up to 10 (accessible) memes at a time | 0.206
220849 | TECHREPUBLIC | 2021-11-24 | Be wary of trusting algorithms in volatile markets | 0.206
220643 | VENTUREBEAT | 2021-11-19 | TruePlan brings data to companies’ headcount and hiring plans | 0.205
220949 | VENTUREBEAT | 2021-11-22 | Report: AI has assisted half of all business owners during the labor shortage | 0.204


ID: 220911

URL: https://venturebeat.com/2021/11/22/nvidias-latest-ai-tech-translates-text-into-landscape-images/

Date: 2021-11-22

Nvidia’s latest AI tech translates text into landscape images

Nvidia today detailed an AI system called GauGAN2, the successor to its GauGAN model, that lets users create lifelike landscape images that don't exist. Combining techniques like segmentation mapping, inpainting, and text-to-image generation in a single tool, GauGAN2 is designed to create photorealistic art with a mix of words and drawings.

"Compared to state-of-the-art models specifically for text-to-image or segmentation map-to-image applications, the neural network behind GauGAN2 produces a greater variety and higher quality of images," Isha Salian, a member of Nvidia's corporate communications team, wrote in a blog post. "Rather than needing to draw out every element of an imagined scene, users can enter a brief phrase to quickly generate the key features and theme of an image, such as a snow-capped mountain range. This starting point can then be customized with sketches to make a specific mountain taller or add a couple of trees in the foreground, or clouds in the sky."

GauGAN2, whose namesake is post-Impressionist painter Paul Gauguin, improves upon Nvidia's GauGAN system from 2019, which was trained on more than a million public Flickr images. Like GauGAN, GauGAN2 has an understanding of the relationships among objects like snow, trees, water, flowers, bushes, hills, and mountains, such as the fact that the type of precipitation changes depending on the season.

GauGAN and GauGAN2 are a type of system known as a generative adversarial network (GAN), which consists of a generator and a discriminator. The generator takes samples (e.g., images paired with text) and predicts which data (words) correspond to other data (elements of a landscape picture). The generator is trained by trying to fool the discriminator, which assesses whether the predictions seem realistic. While the GAN's outputs are initially poor in quality, they improve with the feedback of the discriminator.
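That generator-versus-discriminator loop is generic to GANs and can be shown in a few lines of PyTorch. The sketch below is a minimal toy version under assumed settings (tiny fully connected networks, random vectors standing in for landscape images); it is not GauGAN2's actual architecture or training code.

```python
# Toy GAN training loop illustrating the generator/discriminator dynamic.
# All sizes, networks, and data here are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 64, 32

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1))  # outputs a real-vs-fake logit

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_data = torch.rand(1024, data_dim) * 2 - 1  # stand-in for real images

for step in range(200):
    real = real_data[torch.randint(0, len(real_data), (batch,))]
    fake = generator(torch.randn(batch, latent_dim))

    # Discriminator learns to score real samples high and fakes low.
    d_loss = (bce(discriminator(real), torch.ones(batch, 1)) +
              bce(discriminator(fake.detach()), torch.zeros(batch, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator learns to fool the discriminator into scoring fakes high.
    g_loss = bce(discriminator(fake), torch.ones(batch, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

In GauGAN2's case, per the description above, the generator works from text and segmentation maps rather than plain noise and produces full landscape images, but the fool-the-discriminator feedback loop is the same idea.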
Unlike GauGAN, GauGAN2, which was trained on 10 million images, can translate natural language descriptions into landscape images. Typing a phrase like "sunset at a beach" generates the scene, while adding adjectives like "sunset at a rocky beach" or swapping "sunset" for "afternoon" or "rainy day" instantly modifies the picture.

With GauGAN2, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there, they can switch to drawing, tweaking the scene with rough sketches using labels like "sky," "tree," "rock," and "river," and allowing the tool's paintbrush to incorporate the doodles into images.

GauGAN2 isn't unlike OpenAI's DALL-E, which can similarly generate images to match a text prompt. Systems like GauGAN2 and DALL-E are essentially visual idea generators, with potential applications in film, software, video games, product, fashion, and interior design. Nvidia claims that the first version of GauGAN has already been used to create concept art for films and video games. As with its predecessor, Nvidia plans to make the code for GauGAN2 available on GitHub, alongside an interactive demo on Playground, the web hub for Nvidia's AI and deep learning research.

One shortcoming of generative models like GauGAN2 is the potential for bias. In the case of DALL-E, OpenAI used a special model, CLIP, to improve image quality by surfacing the top samples among the hundreds DALL-E generated per prompt. But a study found that CLIP misclassified photos of Black individuals at a higher rate and associated women with stereotypical occupations like "nanny" and "housekeeper."

In its press materials, Nvidia declined to say how, or whether, it audited GauGAN2 for bias. "The model has over 100 million parameters and took under a month to train, with training images from a proprietary dataset of landscape images. This particular model is solely focused on landscapes, and we audited to ensure no people were in the training images … GauGAN2 is just a research demo," an Nvidia spokesperson explained via email.

GauGAN is one of the newest reality-bending AI tools from Nvidia, creator of deepfake tech like StyleGAN, which can generate lifelike images of people who never existed. In September 2018, researchers at the company described in an academic paper a system that can craft synthetic scans of brain cancer. That same year, Nvidia detailed a generative model that's capable of creating virtual environments using real-world videos.

GauGAN's initial debut preceded GAN Paint Studio, a publicly available AI tool that lets users upload any photograph and edit the appearance of depicted buildings, flora, and fixtures. Elsewhere, generative machine learning models have been used to produce realistic videos by watching YouTube clips, to create images and storyboards from natural language captions, and to animate and sync facial movements with audio clips containing human speech.
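The CLIP-based reranking mentioned above can be sketched with the openly released CLIP weights: score every candidate image against the prompt and keep the highest-scoring ones. The checkpoint name, the rerank helper, and the idea of generating 100 candidates below are illustrative assumptions, not details of OpenAI's DALL-E pipeline or Nvidia's GauGAN2.

```python
# Sketch: rerank generated candidate images with CLIP (illustrative only).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rerank(prompt: str, candidates: list[Image.Image], top_k: int = 4) -> list[Image.Image]:
    """Return the top_k candidates whose CLIP embedding best matches the prompt."""
    inputs = processor(text=[prompt], images=candidates,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        scores = model(**inputs).logits_per_image.squeeze(1)  # one score per image
    best = torch.topk(scores, k=min(top_k, len(candidates))).indices
    return [candidates[i] for i in best.tolist()]

# Hypothetical usage with an assumed generate_image() helper:
# candidates = [generate_image("sunset at a rocky beach") for _ in range(100)]
# top_images = rerank("sunset at a rocky beach", candidates)
```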