Reflection 70B saga continues as training data provider releases post-mortem report

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More On September 5th, 2024, Matt Shumer, co-founder and CEO of the startup Hyperwrite AI (also known as OthersideAI) took to the social network X to post the bombshell news that he had fine-tuned a version of Meta’s open source Llama 3.1-70B into an even more performant large language model (LLM) known as Reflection 70B — so performant, in fact, based on alleged third-party benchmarking test results he published, that it was “the world’s top open-source model,” according to his post. I’m excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week – we expect it to be the best model in the world. Built w/ @GlaiveAI. Read on ⬇️: pic.twitter.com/kZPW1plJuo — Matt Shumer (@mattshumer_) September 5, 2024 However, shortly after its release, third-party evaluators in the AI research and hosting community struggled to reproduce the claimed results, leading to accusations of fraud. Researchers cited discrepancies between the announced benchmark results and their independent tests, sparking a wave of criticism on social platforms such as Reddit and X. In response to these concerns, Shumer pledged he would conduct a review of the issues alongside Sahil Chaudhary, founder of Glaive, the AI startup whose synthetic data Shumer claimed he had trained Reflection 70B on — and which he later revealed to have invested what he called a small amount into. Now, nearly a month later, Chaudhary last night released a post-mortem report on his Glaive AI blog about the Reflection 70B model and published resources for the open-source AI community to test the model and his training process on their own. 
He says while he was unable to reproduce all of the same benchmarks, he “found a bug in the initial code,” resulting in several results appearing higher than what he has found on recent tests of Reflection 70B. However, other benchmark results appear higher than before — adding to the mystery. On September 5th, @mattshumer_ announced Reflection 70B, a model fine-tuned on top of Llama 3.1 70B, showing SoTA benchmark numbers, which was trained by me on Glaive generated data. Today, I’m sharing model artifacts to reproduce the initial claims and a post-mortem to address… — Sahil Chaudhary (@csahil28) October 2, 2024 As Chaudhary wrote in the post: “There were a lot of mistakes made by us in the way we launched the model, and handled the problems reported by the community. I understand that things like these have a significant negative effect on the open source ecosystem, and I’d like to apologize for that. I hope that this adds some clarity to what happened, and is a step in the direction of regaining the lost trust. I have released all of the assets required to independently verify the benchmarks and use this model.“ Sharing model artifacts To restore transparency and rebuild trust, Chaudhary shared several resources to help the community replicate the Reflection 70B benchmarks. These include: Model weights: Available on Hugging Face, providing the pre-trained version of Reflection 70B. Training data: Released for public access, enabling independent tests on the dataset used to fine-tune the model. Training scripts and evaluation code: Available on GitHub, these scripts allow for reproduction of the model’s training and evaluation process. These resources aim to clarify how the model was developed and offer a path for the community to validate the original performance claims. Reproducing the benchmarks In his post-mortem, Chaudhary explained that a major issue with reproducing the initial benchmark results stemmed from a bug in the evaluation code. 
This bug caused inflated scores in certain tasks, such as MATH and GSM8K, due to an error in how the system handled responses from an external API. The corrected benchmarks show slightly lower, but still strong, performance relative to the initial report. The updated benchmark results for Reflection 70B are as follows: MMLU: 90.94% GPQA: 55.6% HumanEval: 89.02% MATH: 70.8% GSM8K: 95.22% IFEVAL: 87.63% Compare that to the originally stated performance of: MMLU: 89.9% GPQA: 55.3% HumanEval: 91% MATH: 79.7% GSM8K: 99.2% IFEVAL: 90.13% Although the revised scores are not as high as those initially reported, Chaudhary asserts that they are more accurate reflections of the model’s capabilities. He also addressed concerns about dataset contamination, confirming that tests showed no significant overlap between the training data and benchmark sets. Reflecting on a hasty release Chaudhary admitted that the decision to release Reflection 70B was made hastily, driven by enthusiasm for the model’s performance on reasoning-based tasks. He noted that the launch lacked sufficient testing, particularly regarding the compatibility of the model files, and that he and Shumer had not verified whether the model could be easily downloaded and run by the community. “We shouldn’t have launched without testing, and with the tall claims of having the best open-source model,” Chaudhary wrote. He also acknowledged that more transparency was needed, especially regarding the model’s strengths and weaknesses. While Reflection 70B excels at reasoning tasks, it struggles in areas like creativity and general user interaction, a fact that was not communicated at launch. Clarifying API confusion One of the more serious accusations involved the suspicion that the Reflection 70B API was simply relaying outputs from Anthropic’s Claude model. Users reported strange behavior in the model’s outputs, including responses that seemed to reference Claude directly. 
Chaudhary addressed these concerns, explaining that although some of these behaviors were reproducible, he asserts there was no use of Claude APIs or any form of word filtering in the Reflection 70B model. He reiterated that the API was run on Glaive AI’s compute infrastructure, and Matt Shumer had no access to the code or servers used during this period. Looking ahead In closing, Chaudhary emphasized his commitment to transparency and expressed his hope that this post-mortem and the release of model artifacts will help restore trust in the project. He also confirmed that

Reflection 70B saga continues as training data provider releases post-mortem report Read More »

Apple Enters a New Era

On June 10, 2024, Apple entered a new era of intelligence. At their annual Worldwide Developers Conference (#WWDC24), Apple unveiled how artificial intelligence will be embedded into their ecosystem and announced several enhanced features for iOS, iPadOS, and MacOS. These new smarter features will empower users through the use of AI. This year’s WWDC marks a pivotal moment for Apple, being one of the most significant events in recent years. Since the introduction of the iPhone, the iPad, and the Apple watch, Apple has been the undisputable leader in terms of the user experience, and dominated sales of smartphones, tablets and smartwatches in value terms. Over the last 17 years, Apple has not faced any other disruptive technology as potentially detrimental for its business and its future as AI, if not tackled in the right way. WWDC 2024 offered Apple and its CEO, Tim Cook, an opportunity to demonstrate how the company will lead in making AI a transformative, advanced, and intelligent experience for their users. With smartphone, tablet and PC sales slowing down due to high adoption rates and offering just incremental improvements from the previous version of devices, Apple needs to reignite consumers excitement to encourage more frequent upgrades. We all remember the long queues outside Apple stores when a new iPhone was launched! The excitement generated by past product launches is critical to Apple’s business and brand. AI Will Enable Apple To Offer Unique and Intelligent Features, Experiences and Services to Their Customers. However, the key question remains: will these announcements be enough to secure Apple’s leading position? Many competitors (from phones to PCs vendors) have already revealed their AI strategies and devices. How can Apple stand out and how can a partnership with OpenAI help? AI-Enabled Devices Will Be the Fastest Growing Segment in for Smartphones and PCs. 
IDC forecasts that AI Smartphones to reach 170 million units in 2024, and AI PCs to account for nearly 60% of all PC shipments by 2027. Historically, the App Store has been at the heart of the value proposition of Apple’s iPhone and the iPad. The slogan, “there’s an App for that” became iconic. However, AI is set to “kill” the apps and a new era is about to begin for Apple and the industry. Previously, the more apps the smartphone (and the iPad) was able to run, and the more powerful the apps, the better the smartphone (or iPad). Today, the smartphone is not ‘smart’ anymore. With AI, the fewer apps needed and the more a phone can use data contextually to assist the user, the better the phone will be. This will revolutionize the user experience, requiring an operating system able to learn from all devices and services in Apple’s ecosystem. Apple’s full control of its ecosystem, hardware and software, and seamless integration gives it a significant advantage over competitors. This is particularly important when the experience will be defined by how well the phone “knows” the user, which can only be achieved by this seamless integration, while making sure the personal information and data remains private. Although Apple is not the first to offer AI-enabled features, it is in its DNA to offer the perfect experience. AI will not just offer the best experience, but the most powerful, disruptive, personalized and private, and that’s exactly what Apple showed today with Apple Intelligence. While AI-generated emojis will grab the headlines, Apple showcased how AI will empower users in several ways, particularly around productivity, education and entertainment. These are the areas where Apple already excels with its range of services and applications. As User Interaction Shifts From Opening an App To Asking the Phone, Siri Will Be the Glue. This killer feature will attract consumers to AI enabled devices. 
Siri will offer a unique way to interact with all Apple devices using our voice in a more personalized and conversational manner. It will become a true digital assistant, contextually understanding various aspects of users’ lives and providing human-like responses by accessing data across all devices to provide the right response to the users’ needs. This conversational digital assistant fully integrated with all devices will be a game changer, and I believe it will offer a compelling reason for many to upgrade their iPhones in the years to come. Apple’s strength lies in the breath of its portfolio and its ecosystem. For consumers, it does not matter what an AI-enabled iPhone is, or how many TOPs it can run. What matters is how well a phone can assist users in managing their lives. By reducing the need to navigate multiple apps, Apple’s AI-enabled devices will offer a smoother, more intuitive and smarter experience. This marks the beginning of a new era for Apple and for their users. For more insights and views, watch this conversation with Tom Mainelli at the Apple Park. Learn what matters most to your customers with IDC’s AI Use Case Discovery Tool—find out more. source

Apple Enters a New Era Read More »

Vectorize debuts agentic RAG platform for real time enterprise data

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More While vector databases are now increasingly commonplace as a core element of an enterprise AI deployment for Retrieval Augmented Generation (RAG), that’s not all that’s needed. Chris Latimer, the CEO and co-founder of startup Vectorize, spent several years working at DataStax where he helped to lead the database vendor’s cloud efforts. A recurring issue that he saw time and again was that the vector database wasn’t really the hard part of enabling enterprise RAG. The hard part of the problem was taking all the unstructured data and getting it into the vector database, in a way that was optimized and going to work well for generative AI. That’s why Latimer started Vectorize just ten months ago, in a bid to help solve that challenge.  Today the company is announcing that it has raised $3.6 million in a seed round of funding, led by True Ventures. Alongside the funding, the company announced the general availability of its enterprise RAG platform. The Vectorize platform can enable an agentic RAG approach for near real-time data capability. Vectorize focuses on the data engineering side of AI. The platform helps companies prepare and maintain their data for use in vector databases and large language models. The Vectorize platform also enables enterprises to quickly build an RAG data pipeline through an intuitive interface. Another core capability is an RAG evaluation feature that allows enterprises to test different approaches. “We kept seeing people get to the end of the development cycle with their Gen AI projects and find out that they didn’t work really well,” Chris Latimer, co-founder and CEO of Vectorize told VentureBeat in an exclusive interview. 
“The context they were getting for their vector database wasn’t the most useful to the large language model, it was still hallucinating or it was misinterpreting the data.” How Vectorize fits into the enterprise RAG stack Vectorize is not a vector database itself. Rather, it’s a platform that connects unstructured data sources to existing vector databases like Pinecone, DataStax, Couchbase and Elastic. Latimer explained that Vectorize ingests and optimizes data from diverse sources for vector databases. The platform will provide a production-ready data pipeline that handles ingestion, synchronization, error handling and other data engineering best practices. Vectorize itself is not a vector embedding technology either. The process of converting data, be it text, images or audio into vectors, is what vector embedding is all about. Vectorize helps users evaluate different embedding models and data chunking methods to determine the best configuration for the enterprise’s specific use case and data. Latimer explained that Vectorize allows users to choose from any number of different embedding models. The different models could include for example OpenAI’s ada, or even Voyage AI embeddings, which are now being adopted by Snowflake. “We do take into account innovative ways to vectorize the data so that you get the best results,” Latimer said. “But ultimately, where we see the value is in giving enterprises and developers a production-ready solution that they just don’t have to worry about the data engineering side.” Using agentic AI to power enterprise RAG One of Vectorize’s key innovations is its “agentic RAG” approach. It’s an approach that combines traditional RAG techniques with AI agent capabilities, allowing for more autonomous problem-solving in applications. Agentic RAG isn’t a hypothetical concept either. It’s already being used by one of Vectorize’s early users, AI inference silicon startup Groq, which recently raised $640 million. 
Groq is using Vectorize’s agentic RAG capabilities to power an AI support agent. The agent can autonomously solve customer problems using the data and context provided by Vectorize’s data pipelines. “If a customer has a question that’s been asked and answered before, you want that agent to be able to solve the customer’s problem without a human getting involved,” Latimer said. “But if there’s something that the agent can’t solve, you do want to have a human in the loop where you can escalate, so this idea of being able to have an agent reason its way through solving a problem, is the whole idea behind an AI agent architecture.” Why real time data pipelines are essential to enterprise RAG A primary reason why an enterprise will use RAG is to connect to its own sources of data. What’s equally important though is making sure that data is up to date. “Stale data is going to lead to stale decisions,” Latimer said. Vectorize provides real-time and near-real-time data update capabilities, with the ability for customers to configure their tolerance for data staleness. “We’ve actually let people configure the platform based on their tolerance for stale data and their need for real-time data,” he said. “So if all you need is to schedule your pipeline to run once a week, we’ll let you do that, and then if you need to run real-time, we’ll let you do that as well, and you’ll have real-time updates as soon as they’re available.” source

Vectorize debuts agentic RAG platform for real time enterprise data Read More »

In A World With No Digital Destination, Get To Digital Mastery And Accelerate Growth With Forrester’s Digital Engine

To outpace the pack and win in today’s market, firms must embrace the ubiquitous nature of digital as the competitive weapon and embed digital capabilities into the core of their business. Forrester’s report, Accelerate Business Growth With Forrester’s Digital Engine, shows how leaders can use components of Forrester’s Digital Engine to build digital mastery that drives top-line growth, operational effectiveness, and delivers value beyond customer conversions. Specifically, it focuses on how to: Set the core of your digital business in motion first by focusing on operations. Exceptional digital outcomes cannot be delivered, much less sustained, if the core of your business is weak. Focus first on nurturing digital talent, continuously calibrating your tech strategy, comprehensive planning, and robust measurement practices. This is the backbone to achieving digital mastery and making next-gen digital strategies a reality. Establish a continuously calibrated digital strategy in a world with no “digital destination.” As generative AI innovations add business value and customer expectations continue to evolve faster and faster, digital strategy, customer intelligence, and innovations must align and recalibrate often to deliver enhanced customer value, ahead of the pack. Deliver business results that transcend the status quo of today’s digital businesses. The beauty of the Digital Engine is, once it’s properly set in motion, it delivers consistent and stronger business results because it balances customer obsession, high-performance IT, and the AI advantage. Digital leaders are bringing new products and services that accrete revenue and accelerate business growth through this unique combination of evidence-backed research.   Learn More For more information, read the report or connect with a Forrester analyst to engage in a fully functional assessment and an exploratory conversation to develop your unique digital strategy.   source

In A World With No Digital Destination, Get To Digital Mastery And Accelerate Growth With Forrester’s Digital Engine Read More »

What’s Impacting Tech Buying in the Digital Economy

Digital business is just standard business in 2024. Companies striving to participate in the digital economy are looking to invest in technology that fits the needs of their company size, employee personas and functions, industry verticals, and current level of technology maturity. Vendors hoping to sell and implement digital technologies need to consider all these factors when meeting with a potential customer. Segmenting and grouping customers by all these factors can help technology vendors provide the best messaging and service. At IDC, we do research that considers the impacts of all these factors, and that gives us insight behind the tech buying curtain into the needs and wants of buyers at all stages of their digital journeys. The Small and Medium Business Market In the recent IDC Webcast, “Behind the Tech Buying Curtain: What Vendors Need to Know,” Katie Evans, IDC Senior Research Director for Worldwide Small and Medium Business Markets, said there are three key topics keeping SMB tech buyers up at night: AI/automation, macroeconomic woes, and heightened security concerns largely fueled by AI, remote work, and the moving to the Cloud. The forward-looking investment priorities for SMBs are all automation and AI related—process automation, connectivity automation, AI (non-GenAI), and GenAI, according to IDC’s Worldwide Small and Medium Business Survey. Rising energy prices highlight SMB’s top macroeconomic woes in that survey, while implementing technology securely is the largest technology challenge. Macro Factors Impacting Investment IDC’s Digital Economy Strategies research, led by Research Analyst Elisabeth Clemmons, focuses in part on the macro landscape impacting businesses, even those focused in specific countries or localities. Impactful events and trends include skills shortages, inflation, supply-chain constraints, energy crises, tensions between countries and war, and elections around the world. 
These events impact technology buying patterns, causing companies to be more cautious with their investments. AI is the technology most likely to persevere and be prioritized in the face of these macro impacts, with AI spending, the AI provider supply chain, and the economic stimulus among AI adopters is projected by IDC’s Macroeconomic Center of Excellence to be 3.5% of global GDP by 2030. The heightened impact of and investment in AI is contributing to two more macro trends impacting businesses: digital regulation and potential raw materials shortages. Data and AI is a top regulatory target with governments taking diverse approaches, while the sheer amount of data in the world is expected to more than triple by 2028—necessitating over 50 times the current annual production levels of neodymium and other critical raw materials. C-Suite Leaders Increase Focus on AI Despite any concerns and challenges that AI my pose, CEOs are scaling AI initiatives, dedicating budgets to AI, and ensuring AI projects receive greater visibility, according to Nupur Singh-Adley, Research Manager leading IDC’s C-Suite Tech Agenda research. As C-suite leaders focus on building digital businesses buoyed by AI technology, they are also mindful of managing risk, including heightened cybersecurity threats and regulations. This is leading to prioritized spending on security, risk, and compliance technologies and greater scrutiny of cybersecurity and risk management at the board level. Along with considering risk, CEOs are also prioritizing responsible AI while building digital businesses that are focused on trust, critical evaluation of tech vendors, and sustainability. Digital Capability Perception vs. Reality While companies of all sizes are facing challenges both internally and externally and they digitize, it’s important for technology vendors to consider that many are also overestimating their digital capabilities. 
In IDC’s Digital Business and AI Transformation Strategies research, we conduct an annual Digital Business Scorecard that measures the digital capabilities of businesses in four areas: Digital Business Models, Data, Operational Processes, and Organization. Overall, when asked to assess themselves, 41% of those surveyed in IDC’s Digital Executive Sentiment Survey believed they were a “mostly digital business.” However, when applying our Digital Business Scorecard methodology, which correlates investments and strategies to business outcomes, only 11% of that same survey sample were categorized as “Digital Business Leaders.” Digital Business Leaders are taking holistic digital strategies and have implemented world-class technology. They view data as a top AI priority while also expanding its use in core operational processes. They have a digital technology architecture that is in lockstep with IT strategy and are focused on improving recruiting and employee engagement. These are the standards that many of those companies that think they are digital aren’t quite achieving—often finding themselves bogged down by inferior technology, not focusing on capitalizing on the value of their data, not looking to automate and standardize processes, and not taking the view of technology as a competitive advantage. The Important Role of Technology Vendors in the Digital Economy Technology vendors are key to digital business and AI transformation strategies—and many companies need vendor expertise and guidance to help them become digital business leaders. Vendor messaging to these companies can be challenging—but it helps to know how company size, macroeconomic impacts, C-suite leadership, and digital capability and perception impacts buying conversations. It’s helpful for vendors to segment their customers, and specialize their strategy and messaging to those individual segments. That’s beneficial for all parties and will help companies succeed in the digital economy. 
Learn what matters most to your customers with IDC’s AI Use Case Discovery Tool—find out more. source

What’s Impacting Tech Buying in the Digital Economy Read More »

OpenAI will bring Cosmopolitan publisher Hearst’s content to ChatGPT

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Is the future of written media — and potentially imagery and videos, too — going to be primarily surfaced to us through ChatGPT? It’s not out of the question at the rate OpenAI is going. At the very least, the $157-billion dollar valued AI unicorn — fresh off the launch of its new Canvas feature for ChatGPT and a record-setting $6.6 billion fundraising round — is making damn well sure it has most of the leading U.S. magazine and text-based news publishers entered into content licensing agreements with it. These enable OpenAI to train on, or at least serve up, vast archives of prior written articles, photos, videos and other journalistic/editorial materials, through ChatGPT, SearchGPT and other AI products, potentially as truncated summaries. The latest major American media firm to join with OpenAI is Hearst, the eponymous media company famed for its “yellow journalism” founder William Randolph Hearst (who helped beat the drum for the U.S. to enter the Spanish-American War as well as demonized marijuana, and was memorably fictionalized by Citizen Kane‘s Charles Foster Kane) which is now perhaps best known as the publisher of Cosmopolitan, the sex and lifestyle magazine aimed at young women, as well as Esquire, Elle, Car & Driver, Country Living, Good Housekeeping, Popular Mechanics and many more. In total, Hearst operates 25 brands in the U.S., 175 websites and more than 200 magazine editions worldwide, according to its media page. However, OpenAI will be specifically surfacing “curated content” from more than 20 magazine brands and over 40 newspapers, including well-known titles such as Cosmopolitan, Esquire, Houston Chronicle, San Francisco Chronicle, ELLE, and Women’s Health. The content will be clearly attributed, with appropriate citations and direct links to Hearst’s original sources, ensuring transparency, according to the brands. 
“Hearst’s other businesses outside of magazines and newspapers are not included in this partnership,” reads a release jointly published on Hearst’s and OpenAI’s websites. It’s unclear whether or not the company will be training its models specifically on Hearst content — or merely piping said content through to end users of ChatGPT and other products. I’ve reached out to an OpenAI spokesperson for clarity and will update when I hear back. Hearst now joins the long and growing list of media publishers that have struck content licensing deals with OpenAI. Among the many that have forged deals with OpenAI include: These partnerships represent OpenAI’s broader ambition to collaborate with established media brands and elevate the quality of content provided through its AI systems. With Hearst’s integration, OpenAI continues to expand its network of trusted content providers, ensuring users of its AI products, like ChatGPT, have access to reliable information across a wide range of topics. What the executives are saying it means Jeff Johnson, President of Hearst Newspapers, emphasized the critical role that professional journalism plays in the evolution of AI. “As generative AI matures, it’s critical that journalism created by professional journalists be at the heart of all AI products,” he said, underscoring the importance of integrating trustworthy, curated content into these platforms. Debi Chirichella, President of Hearst Magazines, echoed this sentiment, noting that the partnership allows Hearst to help shape the future of magazine content while preserving the credibility and high standards of the company’s journalism. These deals signal a growing trend of cooperation between tech companies and traditional publishers as both industries adapt to the changes brought about by advances in AI. 
While OpenAI’s partnerships offer media companies access to cutting-edge technology and the opportunity to reach larger audiences, they also raise questions about the long-term impact on the future of publishing. Some critics argue that licensing content to AI platforms could potentially lead to competition, as AI systems improve and become more capable of generating content that rivals traditional journalism. I myself, as a journalist whose work was undoubtedly scraped and trained by many AI models (and used for lots of other things of which I had no control over or say in), voiced my own hesitation about media publishers moving so quickly to ink deals with OpenAI. These concerns were amplified in recent legal actions, such as the lawsuit filed by The New York Times against OpenAI and Microsoft, alleging copyright infringement in the development of AI models. The case remains in court for now, and NYT remains one of an increasingly few holdouts who have yet to settle with or strike a deal with OpenAI to license their content. Despite these concerns, publishers like Hearst, Condé Nast, and Vox Media are actively embracing AI as a means of staying competitive in an increasingly digital landscape. As Chirichella pointed out, Hearst’s partnership with OpenAI is not only about delivering their high-quality content to a new audience but also about preserving the cultural and historical context that defines their publications. This collaboration, she said, “ensures that our high-quality writing and expertise, cultural and historical context and attribution and credibility are promoted as OpenAI’s products evolve.” For OpenAI, these partnerships with major media brands enhance its ability to deliver reliable, engaging content to its users, aligning with the company’s stated goal of building AI products that provide trustworthy and relevant information. 
As Brad Lightcap, COO of OpenAI, explained, bringing Hearst’s content into ChatGPT elevates the platform’s value to users, particularly as AI becomes an increasingly common tool for consuming and interacting with news and information. source

OpenAI will bring Cosmopolitan publisher Hearst’s content to ChatGPT Read More »

The Future Of Network APIs

Many of my recent client conversations have been on network APIs, also known as network open APIs. In particular, they want to discuss timelines, challenges, opportunities, use cases, and future market outlook. Similar to what I discussed in my previous blog about 5G network slicing, the reality is not as rosy as one might expect, once again due to the heavy dependency on software development communities outside the telecommunications industry’s reach. Why are people excited? In principle, network APIs allow carriers to provide network information or invoke network actions using RESTful open APIs, creating a much-welcomed monetization stream for telcos. Its success, however, relies upon a pervasive ecosystem and significant involvement from software developers. Vendors and telcos alike seek to solve for the decade-long shortage of application engineering knowledge within the telco ecosystem. What will be its future? I’ve organized my views across a timeline and major advancements expected in each period. Short-Term: 1 to 2 years This period will be mainly about experimentation, with a few point examples appearing in industries such as financial and commerce (e.g., anti-fraud, know-your-customer [KYC], identify verification, etc.). Although many telco-driven alliances trying to promote the ease and scale of consumption will form, a rapport with the software development community won’t be achieved. Software developers won’t include these APIs in their software development lifecycles (SDLCs) yet. Midterm: 3 to 5 years The first software alliance promoting adoption is formed. A true evangelization of a network-aware SDLC starts in this phase. An initial adoption is achieved, with the software development community understanding the value stream and starting to use some network APIs. Application design, engineering, and testing efforts are well understood, as well as the potential security and privacy risks. 
Some network APIs (e.g., quality on demand) won’t be used widely yet due to the parallel development of 5G standalone (SA) networks and devices with slicing capabilities.

Long-term: 5+ years. Network APIs become common and play a key role in most application SDLCs. A proliferation of APIs and use cases is expected. A network-aware SDLC is a recognized practice within the software development community. Quality-on-demand APIs are consolidated on the basis of ubiquitous 5G SA slicing and roaming availability. Initial experimentation with APIs for 6G networks starts in this period; these APIs will provide new information, such as network sensing data, to be used by applications.

Do you concur with my view? Have you launched a commercial application using network APIs? I would love to hear your feedback. Need more guidance? Engage with me via an inquiry call by emailing [email protected].
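To make the monetization idea concrete, the sketch below shows roughly what “invoking a network action over a RESTful open API” looks like from a developer’s side: assembling a quality-on-demand session request of the kind telco API alliances are standardizing. The endpoint, field names, and QoS profile name here are assumptions for illustration only, not any specific carrier’s schema.

```python
import json

# Hypothetical carrier endpoint; real deployments would publish their own.
QOD_ENDPOINT = "https://api.example-carrier.com/qod/v1/sessions"

def build_qod_session(device_ip: str, app_server_ip: str,
                      qos_profile: str, duration_s: int) -> dict:
    """Assemble the JSON body a developer would POST to request a
    temporary quality-of-service boost for one device (field names
    are illustrative)."""
    return {
        "device": {"ipv4Address": device_ip},
        "applicationServer": {"ipv4Address": app_server_ip},
        "qosProfile": qos_profile,   # e.g., a low-latency profile
        "duration": duration_s,      # seconds the boost should last
    }

payload = build_qod_session("203.0.113.7", "198.51.100.10",
                            "QOS_LOW_LATENCY", 600)
print(json.dumps(payload, indent=2))
```

From the developer’s perspective this is just another REST call in the SDLC, which is exactly why adoption hinges on the software community rather than on the telcos themselves.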


Skills, AI and the Enterprise: Three Strategies for the Road Ahead

There’s no way around it: the time to plan for AI skills and roles is now. Just a year into the GenAI hype cycle, the question is no longer whether enterprises must skill up employees for the age of AI; it’s when and how they should do it. For leading organizations in every sector, the heat is on. According to IDC’s Skills and the AI-Enabled Business study, some 40% of enterprise and line-of-business respondents believe their GenAI tech investments will continue in the foreseeable future. That was slightly ahead of IT skilling investments, which 37% of respondents predicted would be the most persistent investment moving forward.

FIGURE 1: AI and Skills for Business Growth (Source: Future Enterprise Resiliency & Spending Survey Wave 3, IDC, April 2024, N=887)

The two topics, GenAI and skills, are directly related. Organizations in all sectors and geographies face a widening shortage of IT skills, including skills related to security, cloud, IT service management and AI. And in addition to being a subject of skills development, GenAI technologies can and do speed training. But there’s a wider aspect for enterprises to consider, too. Organizations must make sure leaders, employees, partners and, eventually, customers become fluent in AI and GenAI-fueled tools and processes; soon these will be foundational to most business processes. Amid growing IT skills shortages, the stakes are high. To remain competitive, global IT and business leaders must move to accept and deploy AI as a transformational force, one that will fully remake roles, skills requirements, and the very nature of innovation and creativity in the enterprise. This article presents three strategies business and IT stakeholders should embrace now to ready their enterprises for the age of AI.

1. Implement GenAI to Improve and Speed Training for All Skills

As the IDC AI study notes, enterprises across geographies already face a severe IT skills shortage.
Now they face rising demand for AI and GenAI capabilities, too, a reality that exacerbates the substantial skill gaps so many enterprises already report. Some 30% of IT and LOB leaders see skills and labor shortages as top risk factors for tech strategies and budgets in the coming year. In IDC’s IT Skills Survey (January 2024), 62% of IT leaders said a lack of skills had resulted in missed revenue growth objectives. More than 60% say the dearth of skills has also led to quality problems and a loss of customer satisfaction overall. And IDC predicts that, by 2026, more than 90% of organizations worldwide will feel similar pain, adding up to some $5.5T in losses caused by product delays, impaired competitiveness and lost revenue.

Notably, GenAI can help to improve IT training outcomes. In fact, more than half of IT leaders tell IDC they have already begun leveraging GenAI tools to create and update courses (42%) and to analyze skill assessment interviews and transcripts (46%). Down the road, IDC expects IT training tools to add more AI and GenAI technology, allowing courses to be personalized to the skills, roles and learning styles of employees, a development that should add up to faster and better skilling outcomes overall. At this writing, almost every major IT training platform has either announced or delivered GenAI features to help personalize training. These features include GenAI chatbot tutors for enterprise learners, simulations that help tech staff practice problem-solving, and AI-powered games and challenges for skill and learning reinforcement. Some training programs now leverage AI to adapt content to the pace, learning style and role of the learner. Digital adoption platform (DAP) vendors are following suit, too: most major players in the category now or soon will offer rich AI data analytics that help enterprises identify learning patterns and bottlenecks.

2. Focus on Developing a Holistic Array of Skills

No enterprise survives on tech skills alone. To compete in the age of AI, organizations also need digital business skills, leadership skills and human skills. Likewise, employees in HR, marketing and communications need to better understand the nature and technical dependencies of the new applications essential for guiding their teams; absent sufficient technical knowledge and critical judgment about how these tools function, they will end up misguiding those teams. When considering skill gaps in the enterprise, be sure to account for the broad swath of skills employees will need to get the job done. Take time to map the skills each role requires now against the skills it will need in the foreseeable future.

3. Deliver AI Training Across the Organization

In addition to focusing on the technical, leadership and human skills around AI and GenAI, enterprises should take care not to lose focus on the humans who truly power a business. In a 2024 IDC survey, only 36% of organizations said they are mandating AI and GenAI awareness training. Enterprises must do a lot better than that. Without proper support and socialization to help humans learn how to work and partner with AI systems, AI initiatives will fall short. Start with awareness. That is the key to any successful tech implementation, but it is most critical with AI and GenAI, which, understandably, can make employees nervous about losing their jobs. Implement mandatory training sessions, workshops and seminars to ensure that employees understand how AI and GenAI play into company strategy overall. They must understand how the organization intends to use such tech to enhance current roles and create new ones. Automation, after all, isn’t new. Employees should understand that GenAI and AI simply continue and accelerate the need for upskilling and cross-skilling.

While dispelling misconceptions about AI-based automation stealing jobs, leaders should highlight exactly how AI can augment human capabilities. Frame the promise of AI in terms of reassigning rote work, which allows employees to focus on more strategic tasks.

FIGURE 3: Automation technologies’ impact on employees over 18 months (Source: Future Enterprise Resiliency & Spending Survey Wave 3, IDC, April 2024, N=887)


Foxconn to build Taiwan’s fastest AI supercomputer with Nvidia Blackwell

Nvidia and Foxconn are building Taiwan’s largest supercomputer using Nvidia Blackwell chips. The project, the Hon Hai Kaohsiung Super Computing Center, revealed Tuesday at Hon Hai Tech Day, will be built around Nvidia’s Blackwell graphics processing unit (GPU) architecture and feature the GB200 NVL72 platform, with a total of 64 racks and 4,608 Tensor Core GPUs. With expected AI performance of over 90 exaflops, the machine would easily be the fastest in Taiwan. Foxconn plans to use the supercomputer, once operational, to power breakthroughs in cancer research, large language model development and smart city innovations, positioning Taiwan as a global leader in AI-driven industries. Foxconn’s “three-platform strategy” focuses on smart manufacturing, smart cities and electric vehicles. The new supercomputer will play a pivotal role in supporting Foxconn’s ongoing efforts in digital twins, robotic automation and smart urban infrastructure, bringing AI-assisted services to urban areas like Kaohsiung. Construction has started on the new supercomputer, housed in Kaohsiung, Taiwan. The first phase is expected to be operational by mid-2025, with full deployment targeted for 2026. The project will integrate Nvidia technologies such as the Nvidia Omniverse and Isaac robotics platforms for AI and digital-twin work to help transform manufacturing processes. “Powered by Nvidia’s Blackwell platform, Foxconn’s new AI supercomputer is one of the most powerful in the world, representing a significant leap forward in AI computing and efficiency,” said Foxconn vice president James Wu in a statement. The GB200 NVL72 is a state-of-the-art data center platform optimized for AI and accelerated computing.
Each rack features 36 Nvidia Grace CPUs and 72 Nvidia Blackwell GPUs connected via Nvidia’s NVLink technology, delivering 130TB/s of bandwidth. The Nvidia NVLink Switch allows the 72-GPU rack to function as a single, unified GPU, making it ideal for training large AI models and executing complex real-time inference on trillion-parameter models. Taiwan-based Foxconn, officially known as Hon Hai Precision Industry Co., is the world’s largest electronics manufacturer, producing a wide range of products, from smartphones to servers, for the world’s top technology brands. Foxconn is building digital twins of its factories using Nvidia Omniverse, and it was also one of the first companies to use Nvidia NIM microservices in the development of domain-specific large language models (LLMs) embedded into a variety of internal systems and processes in its AI factories for smart manufacturing, smart electric vehicles and smart cities.
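The configuration figures quoted above are internally consistent and easy to sanity-check: 64 racks of 72 GPUs each gives the reported GPU total. A minimal sketch, using only the rack counts from this article (the CPU total is derived here, not quoted):

```python
# Sanity-check the GB200 NVL72 configuration figures quoted in the article:
# 64 racks, each with 72 Blackwell GPUs and 36 Grace CPUs.
RACKS = 64
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

total_gpus = RACKS * GPUS_PER_RACK   # matches the reported 4,608 Tensor Core GPUs
total_cpus = RACKS * CPUS_PER_RACK   # 2,304 Grace CPUs (derived, not quoted)

print(total_gpus, total_cpus)  # 4608 2304
```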


Upwork vs. Fiverr: Which freelance job site is best?

On its website, Fiverr proudly boasts that a gig is purchased every four seconds. The service is largely known for logo design, with over 50 million transactions sold on its platform to date. It also has major name recognition, with clients that include Facebook, Google, Netflix, and PayPal. Jobs are available for freelancers in several areas, including graphic design, digital marketing, writing & translation, music & audio, programming & tech, lifestyle, data, video & animation, and business. Website design, copywriting, SEO, and illustration are all popular services listed on the platform, but there are tons of categories to choose from. Clients can use a comprehensive search tool to find the right freelancer for their needs, or you can be proactive and simply add your own products or services for sale. It is free to sign up, and after creating your seller profile, you can begin to create gigs or packages that show off your skills and attract employers.

There are four seller levels for every freelancer using Fiverr:

New Seller: Inexperienced sellers new to Fiverr begin here.
Level 1 Seller: A seller must successfully complete 10 highly rated gigs and have an active Fiverr account for a minimum of 60 days.
Level 2 Seller: A seller must successfully complete 50 highly rated, on-time gigs and have an active Fiverr account for a minimum of 120 days.
Top Rated Seller: A seller must successfully complete 100 highly rated, on-time gigs earning a minimum of $20,000 and have an active Fiverr account for a minimum of 180 days.

Once you post your profile and your jobs or services, you are ready to go. When a client purchases a project, you get a notification to begin work. New to the service is Fiverr Business, which is created for teams and allows businesses to work with experienced freelancers. If you’re a brand-new freelancer, there is no need to worry, either.
Fiverr has excellent resources to help you grow, like the on-demand tutorial series Learn from Fiverr. For simpler bookkeeping, it integrates with the AND CO app, which allows for proposals and task management in addition to invoicing. Fiverr skips hourly rates, instead opting for flat-rate gigs and projects. The platform supports projects that range in price from $5 to $10,000.
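The seller-tier thresholds described above translate directly into a small eligibility check. A minimal sketch, assuming only the gig, tenure, and earnings thresholds quoted in this article (Fiverr’s actual promotion criteria include additional factors, such as ratings and response times):

```python
def fiverr_level(completed_gigs: int, account_days: int, earnings_usd: float) -> str:
    """Return the highest seller level a freelancer qualifies for,
    using only the thresholds quoted in the article."""
    if completed_gigs >= 100 and account_days >= 180 and earnings_usd >= 20_000:
        return "Top Rated Seller"
    if completed_gigs >= 50 and account_days >= 120:
        return "Level 2 Seller"
    if completed_gigs >= 10 and account_days >= 60:
        return "Level 1 Seller"
    return "New Seller"

print(fiverr_level(55, 130, 4_000))    # Level 2 Seller
print(fiverr_level(120, 365, 25_000))  # Top Rated Seller
```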
