A technical report says Amália, a large language model, outperforms other open AI models on European Portuguese. AMALIA-DPO looks like the strongest of the open models, leading on linguistic and semantic tasks. It also shows a solid grasp of how Portuguese is used in Portugal and of the local culture.
Amália also got the best score among the open-source models on some Portuguese tests with essay questions. According to the report, it is good at understanding tricky questions and writing clear, correct answers that sound natural. It is either the best or on par with the top models at language understanding, reasoning, and writing quality text. It also passes security tests.
Amália was trained on data from arquivo.pt and on datasets built specifically for European Portuguese. Training mixed language learning with instruction tuning. Few benchmarks existed for PT-PT, so the team had to build their own and translate a number of existing datasets using good machine translation.
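To give a rough idea of what mixing language learning with instruction tuning can look like, here is a minimal sketch that draws training examples from two sources at a chosen ratio. It is only an illustration, not the team's actual pipeline: the function `mix_examples`, the `instruction_fraction` parameter, its default value, and the sample sentences are all assumptions made up for this example.

```python
import random


def mix_examples(plain_texts, instruction_pairs, n, instruction_fraction=0.25, seed=0):
    """Sample n training examples from two sources (with replacement).

    Hypothetical helper: instruction_fraction is an illustrative knob,
    not a value taken from the report.
    """
    if not plain_texts or not instruction_pairs:
        raise ValueError("both sources must be non-empty")
    if not 0.0 <= instruction_fraction <= 1.0:
        raise ValueError("instruction_fraction must be between 0 and 1")

    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    mixed = []
    for _ in range(n):
        if rng.random() < instruction_fraction:
            # Instruction example: a prompt followed by its answer.
            prompt, answer = rng.choice(instruction_pairs)
            mixed.append({"kind": "instruction", "text": f"{prompt}\n{answer}"})
        else:
            # Plain text example, as used for general language learning.
            mixed.append({"kind": "plain", "text": rng.choice(plain_texts)})
    return mixed


if __name__ == "__main__":
    plain = ["O elétrico sobe a colina devagar."]
    pairs = [("Qual é a capital de Portugal?", "Lisboa.")]
    for example in mix_examples(plain, pairs, n=4, instruction_fraction=0.5):
        print(example["kind"], "|", example["text"].replace("\n", " / "))
```

Sampling by ratio is just one simple way to combine the two kinds of data; the report may well use a different schedule, such as separate training stages.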
Several Portuguese universities are working on Amália with help from supercomputers in Portugal and elsewhere in Europe, and the model keeps getting better. Next, the team wants to use reinforcement learning and new training data to make it reason better in European Portuguese. The goal is a dependable AI assistant for people who use PT-PT.