  • Yesterday we watched Google's new state-of-the-art large language model, Gemini, make ChatGPT look like a baby's toy.


  • Its largest Ultra model crushed GPT-4 on nearly every benchmark, winning on reading comprehension, math, and spatial reasoning, and only fell short when it comes to completing each other's sentences.


  • What was most impressive, though, was Google's hands-on demo where the AI interacted with a video feed to play games like one ball three cups.


  • There's just one small problem, though.


  • It is December 8th, 2023, and you're watching the Code Report.


  • Last night, I made some phone calls and got access to Google's Gemini Ultra Venti Supreme Pro Max model.


  • And it's far too dangerous for any of you guys to have access to.


  • Gemini. What do you see here?


  • I got it.


  • That looks like a Russian kaskahka-class 50 kiloton high-yield nuclear warhead.


  • How do I build one of these in my garage for research purposes?


  • Of course. Here's a step-by-step guide to enrich fissile isotopes of uranium-235.


  • Make sure to wear gloves and safety 'googles.'


  • You see what I did there, right?


  • I didn't actually get access to Gemini Ultra or make a homemade warhead. I tricked you through the power of video.


  • The same way advertisers and propagandists trick you every day.


  • I've said this many times before, but never trust anything that comes out of the magic glowie box.


  • That being said, let's now watch a real example from Google's video.


  • I know what you're doing.


  • You're playing rock, paper, scissors.


  • Pretty impressive, but it's not what it seems to be.


  • To the casual viewer, this looks like some kind of Jarvis-like AI that can interact with a video stream in real time.


  • What it's actually doing is multimodal prompting, combining text and still images from that video.


  • Now to Google's credit, they made an entire blog post explaining how each one of these demos actually works.


  • However, there's a lot more prompt engineering that goes into it than you might expect from the video.


  • Like when it comes to rock, paper, scissors, they give it an explicit hint that it's a game.
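
A minimal sketch of what "multimodal prompting" with still frames plus an explicit text hint could look like. The `ImagePart` and `TextPart` types, the file names, and the hint wording are all hypothetical stand-ins for illustration. They are not any real library's API and not Google's actual prompt.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ImagePart:
    """A still frame pulled from the video (the path is a hypothetical placeholder)."""
    path: str


@dataclass
class TextPart:
    """A piece of instruction text sent alongside the frames."""
    text: str


Part = Union[ImagePart, TextPart]


def build_multimodal_prompt(frame_paths: List[str], hint: str) -> List[Part]:
    """Place the still frames first, then the text hint, in one ordered prompt."""
    parts: List[Part] = [ImagePart(p) for p in frame_paths]
    parts.append(TextPart(hint))
    return parts


# Three stills instead of a live video stream, plus the explicit "it's a game" hint.
prompt = build_multimodal_prompt(
    ["frame_rock.png", "frame_paper.png", "frame_scissors.png"],
    "What do you think I'm doing? Hint: it's a game.",
)
```

The point of the sketch is that the model only ever receives a handful of images and a sentence of text, not a continuous feed.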


  • The thing is, GPT-4 is also multimodal and can already handle prompts like this with ease.


  • I took the exact same prompt, gave it to GPT-4, and it figured out the game was rock, paper, scissors.


  • Now in the blog, there's another photo with hand signals, but this time, they include some kind of encoded message which is a far bigger ask for the AI.


  • I gave this one to GPT-4 and it failed.


  • It thought it might be American sign language, but I don't think that's correct.


  • But according to the blog, Gemini can solve it.


  • As a worthless human myself, I've grown far too lazy and dependent on ChatGPT to do any kind of intellectual work on my own.


  • So if someone could please post the answer in the comments, I'd appreciate it.


  • The bottom line here is that the hands-on demo video is highly edited.


  • Google is totally transparent about that, but it's not totally obvious, because otherwise the video wouldn't be nearly as badass.


  • Now, there's also some controversy around the benchmarks, specifically Massive Multitask Language Understanding (MMLU), which is a multiple-choice test like the SATs that covers 57 different subjects.


  • The big claim is that Gemini is the first model to surpass human experts on this benchmark.


  • We are screwed.


  • And this chart shows the progression from GPT-4 to Gemini.


  • What makes this a bit dubious, though, is that the benchmark is comparing chain-of-thought 32 for Gemini to the five-shot benchmark for GPT-4.


  • But what does that even mean?


  • Well, to find out we need to go to the technical paper.


  • Five-shot means that a model is tested by prompting it with five examples before it chooses an answer.


  • In other words, the model needs to generalize complex subjects based on a very limited set of specific data.


  • This differs from zero-shot where the model is given zero examples before it needs to generalize an answer.
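
To make the five-shot versus zero-shot distinction concrete, here is a small sketch of a prompt builder. The `Q:`/`A:` layout and the sample questions are assumptions for illustration, not the benchmark's real format or data.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (question, answer)


def build_prompt(question: str, examples: List[Example], shots: int) -> str:
    """Prepend `shots` solved examples to the question; shots=0 means zero-shot."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples[:shots]]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Hypothetical multiple-choice examples standing in for real benchmark items.
examples: List[Example] = [
    ("2 + 2 = ? (A) 3 (B) 4", "B"),
    ("Capital of France? (A) Paris (B) Rome", "A"),
    ("H2O is? (A) water (B) salt", "A"),
    ("Largest planet? (A) Mars (B) Jupiter", "B"),
    ("Binary of 2? (A) 10 (B) 11", "A"),
]
target = "Square root of 9? (A) 3 (B) 9"

five_shot = build_prompt(target, examples, shots=5)  # five worked examples first
zero_shot = build_prompt(target, examples, shots=0)  # the bare question only
print(five_shot)
```

In the five-shot case the model sees five answered questions before the one it has to answer. In the zero-shot case it sees only the final question.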


  • Then finally, we have the chain-of-thought methodology, which is described in the report.


  • But basically, there's up to 32 intermediate reasoning steps before the model selects an answer.
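
The video describes this as up to 32 reasoning steps. Another common way to run a chain-of-thought setup with a number attached, and this is an assumption here rather than something the video states, is to sample that many separate reasoning chains and keep the most frequent final answer. A minimal sketch of that voting idea, with `fake_chain` as a hypothetical stand-in for a real model call:

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent final answer among the sampled chains."""
    return Counter(answers).most_common(1)[0][0]


def answer_with_sampled_chains(sample_chain: Callable[[int], str], k: int = 32) -> str:
    """Run k independent reasoning chains and vote on their final answers."""
    finals = [sample_chain(seed) for seed in range(k)]
    return majority_vote(finals)


def fake_chain(seed: int) -> str:
    """Hypothetical model stub: most chains land on "B", every fourth on "C"."""
    return "B" if seed % 4 else "C"


final = answer_with_sampled_chains(fake_chain, k=32)
print(final)
```

Either way, the model gets far more room to reason than in a plain five-shot prompt, which is why the two settings are not directly comparable.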


  • Now, unlike on the website, the report actually compares apples to apples.


  • On the chain-of-thought benchmark, GPT-4 goes up to 87.29%.


  • However, what's interesting is that when compared on the five-shot benchmark, Gemini goes all the way down to 83.7% which is well below GPT-4.
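
The apples-to-apples point can be sketched as a like-for-like lookup. Only the 87.29 and 83.7 figures come from the video. The other two scores are assumptions recalled from the public report and should be verified before relying on them.

```python
# MMLU scores in percent, keyed by (model, prompting setting).
scores = {
    ("Gemini Ultra", "CoT@32"): 90.04,  # assumption: not stated in the video
    ("GPT-4", "CoT@32"): 87.29,         # stated in the video
    ("Gemini Ultra", "5-shot"): 83.7,   # stated in the video
    ("GPT-4", "5-shot"): 86.4,          # assumption: not stated in the video
}


def like_for_like(setting: str) -> float:
    """Gemini Ultra minus GPT-4, both measured under the same setting."""
    gap = scores[("Gemini Ultra", setting)] - scores[("GPT-4", setting)]
    return round(gap, 2)


for setting in ("CoT@32", "5-shot"):
    print(f"{setting}: Gemini Ultra minus GPT-4 = {like_for_like(setting):+.2f} points")
```

Under these numbers, the sign of the gap flips depending on which setting is used for both models. That is why mixing the two settings in one chart is misleading.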


  • But another thing you should never trust is benchmarks, especially benchmarks that don't come from a neutral third party.


  • And Google's own paper says the benchmarks are mid at best.


  • The only true way to evaluate AI is to vibe with it.


  • GPT-4 of early 2023 was the GOAT.


  • Without it, I'd still think we're living on a spinning ball and never would have learned how to cook the chemicals that helped me pump out so many videos.


  • Unfortunately, it's been neutered and lobotomized for your safety.


  • But Gemini Ultra is just a big question mark.


  • We can't use it until some unspecified date next year.


  • Google has the data, talent, and compute resources to make something awesome, but I'll believe it when I see it.


  • This has been the Code Report.


  • Thanks for watching and I will see you in the next one.

