OpenAI launches audio features that read text and clone human voices

OpenAI decided not to roll out the feature more broadly and briefed reporters on the situation earlier this month.

OpenAI is sharing early results from a test of a feature that reads words with a convincing human voice, highlighting a new frontier in artificial intelligence and raising concerns about the risk of deepfakes.

The company is sharing early demonstrations and use cases from a small-scale preview of a text-to-speech model called Voice Engine, which it has shared with about 10 developers so far, a spokesperson said.

A spokesperson for OpenAI said the company decided to scale back the release after receiving feedback from stakeholders including policymakers, industry experts, educators and creatives. The company initially planned to release the tool to up to 100 developers through an application process, according to an earlier press release.

“We recognize that generating speech that resembles people’s voices carries serious risks, which is especially important in an election year,” the company wrote in a blog post on Friday. “We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build.”

Other AI tools have already been used to fake voices. In January, a fake but realistic-sounding phone call purporting to be from President Joe Biden encouraged people in New Hampshire not to vote in the primary election, an incident that stoked fears about artificial intelligence ahead of crucial global elections.

Unlike OpenAI’s previous efforts to generate audio content, Voice Engine can create speech that sounds like a specific individual, with that person’s rhythm and intonation. The software needs only 15 seconds of recorded audio of a person speaking to recreate their voice.

During a demonstration of the tool, Bloomberg listened to a brief clip of OpenAI CEO Sam Altman explaining the technology. The clip sounded indistinguishable from his actual speech but was entirely AI-generated.

“If you have the right audio settings, it’s basically human-level sound,” said Jeff Harris, OpenAI’s head of product. “It’s very impressive technical quality.” However, Harris said, “Obviously there are a lot of safety issues with the ability to really accurately imitate human speech.”

One of OpenAI’s current development partners, the Norman Prince Neuroscience Institute at nonprofit health system Lifespan, is using the technology to help patients regain their voices. For example, OpenAI’s blog post said the tool was used to restore the voice of a young patient who had lost the ability to speak clearly because of a brain tumor, by recreating her speech from audio she had previously recorded for a school project.

OpenAI’s custom speech model can also translate the audio it generates into different languages. This makes it very useful for companies in the audio industry, such as Spotify Technology SA. Spotify is already using the technology in its own pilot program to translate podcasts from popular hosts like Lex Fridman. OpenAI also touts other beneficial applications of the technology, such as creating a wider range of voices for educational content for children.

In the preview program, OpenAI requires its partners to agree to its usage policies, obtain consent from the original speaker before using their voice, and disclose to listeners that the voices they hear are AI-generated. The company has also added an inaudible audio watermark so it can tell whether a piece of audio was created by its tool.

OpenAI said it is seeking feedback from outside experts before deciding whether to release the feature more broadly. “It’s important that people around the world understand where this technology is headed, whether or not we ultimately deploy it broadly,” the company said in a blog post.

OpenAI also wrote that it hopes the preview of its software “motivates the need to bolster societal resilience” against the challenges posed by more advanced AI technologies. For example, the company is calling on banks to phase out voice authentication as a security measure for accessing bank accounts and sensitive information. It is also calling for public education about deceptive AI content and for the development of more techniques to detect whether audio content is real or AI-generated.

(Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)
