Changes

Llama.cpp

Revision as of 05:44, 30 July 2025

489 bytes added, 1 month ago

Line 25: Line 25:

Georgi Gerganov began developing llama.cpp in March 2023 as an implementation of the [[Llama]] inference code in [[Чистый C|pure C]]/[[Чистый C++|C++]] with no [[Зависимости (программирование)|dependencies]].

−

This sharply improved performance on computers without a GPU or other dedicated hardware, which was the goal of the project<ref name="register-llamafile" /><ref name="arstechnica">{{cite web |last1=Edwards |first1=Benj |title=You can now run a GPT-3-level AI model on your laptop, phone, and Raspberry Pi |url=https://arstechnica.com/information-technology/2023/03/you-can-now-run-a-gpt-3-level-ai-model-on-your-laptop-phone-and-raspberry-pi/ |website=arstechnica.com |date=13 March 2023 |access-date=15 April 2024}}</ref><ref>{{cite web |title=Democratizing AI with open-source language models |url=https://lwn.net/Articles/931853/ |website=lwn.net |access-date=28 July 2024}}</ref>

+

This sharply improved performance on computers without a GPU or other dedicated hardware, which was the goal of the project<ref name="register-llamafile" /><ref name="arstechnica">{{cite web |last1=Edwards |first1=Benj |title=You can now run a GPT-3-level AI model on your laptop, phone, and Raspberry Pi |url=https://arstechnica.com/information-technology/2023/03/you-can-now-run-a-gpt-3-level-ai-model-on-your-laptop-phone-and-raspberry-pi/ |website=arstechnica.com |date=13 March 2023 |access-date=15 April 2024}}</ref><ref name="Wiest">{{cite journal |last1=Wiest |first1=Isabella Catharina |last2=Ferber |first2=Dyke |last3=Zhu |first3=Jiefu |last4=van Treeck |first4=Marko |last5=Meyer |first5=Sonja K. |last6=Juglan |first6=Radhika |last7=Carrero |first7=Zunamys I. |last8=Paech |first8=Daniel |last9=Kleesiek |first9=Jens |last10=Ebert |first10=Matthias P. |last11=Truhn |first11=Daniel |last12=Kather |first12=Jakob Nikolas |title=Privacy-preserving large language models for structured medical information retrieval |journal=npj Digital Medicine |date=2024 |volume=7 |issue=257 |page=257 |doi=10.1038/s41746-024-01233-2|pmid=39304709 |pmc=11415382 }}</ref>

llama.cpp gained popularity among users without specialized hardware because it could run on a CPU alone, including on [[Android]] devices<ref name="arstechnica" /><ref name="mozilla-introducing-llamafile">{{cite web |last1=Hood |first1=Stephen |title=llamafile: bringing LLMs to the people, and to your own computer |url=https://future.mozilla.org/builders/news_insights/introducing-llamafile/ |website=Mozilla Innovations |access-date=28 July 2024 |language=en}}</ref><ref>{{cite web |title=Democratizing AI with open-source language models |url=https://lwn.net/Articles/931853/ |website=lwn.net |access-date=28 July 2024}}</ref>. The project was initially developed for CPUs, but support for GPU inference was added later<ref name="Rajput">{{cite book |last1=Rajput |first1=Saurabhsingh |last2=Sharma |first2=Tushar |chapter=Benchmarking Emerging Deep Learning Quantization Methods for Energy Efficiency |title=2024 IEEE 21st International Conference on Software Architecture Companion (ICSA-C) |date=4 June 2024 |pages=238–242 |doi=10.1109/ICSA-C63560.2024.00049|isbn=979-8-3503-6625-9 }}</ref>.

In.wiki
