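Amyotrophic lateral sclerosis (ALS) is a devastating disease that takes away a patient's ability to move and speak. While volunteering at ALS associations during my high school summers, I learned that some patients can communicate only with their eyes and assistive technology, which has significant limitations in cost and efficiency. Powered by the Google Gemini API, my free multi-language app, Gaze Link, helps ALS patients communicate with their eyes independently, accurately, and efficiently.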
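First, Gaze Link detects the user's face and eyes with Google ML Kit and OpenCV. After a 30-second calibration and some setting adjustments, the user can type words on Gaze Link's multi-language keyboard using 6 eye gestures. However, eye-typing can be very slow for long sentences.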
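The text above does not say how the 6 eye gestures map to keys. As a purely illustrative sketch (not Gaze Link's actual layout), one possible scheme pairs two consecutive gestures to pick one of 6 x 6 = 36 symbols. The gesture names, the symbol set, and the `encode`/`decode` helpers below are all assumptions made for this example.

```python
from itertools import product
import string

# Hypothetical gesture names; the source only states that there are 6 gestures.
GESTURES = ("left", "right", "up", "down", "up_left", "up_right")

# Assumed symbol set: 26 letters + 10 digits = 36 = 6 * 6 gesture pairs.
SYMBOLS = string.ascii_lowercase + string.digits

# Each ordered pair of gestures selects exactly one symbol.
CODEBOOK = dict(zip(product(GESTURES, repeat=2), SYMBOLS))
REVERSE = {symbol: pair for pair, symbol in CODEBOOK.items()}


def decode(gestures):
    """Turn a flat gesture sequence into text, two gestures per symbol."""
    if len(gestures) % 2 != 0:
        raise ValueError("expected an even number of gestures")
    pairs = zip(gestures[0::2], gestures[1::2])
    return "".join(CODEBOOK[pair] for pair in pairs)


def encode(text):
    """Inverse of decode: list the gestures needed to type `text`."""
    gestures = []
    for symbol in text.lower():
        gestures.extend(REVERSE[symbol])
    return gestures
```

Under this assumed layout, `decode(["right", "right", "up", "up", "down", "right"])` yields `"hot"`, so every character costs two gestures. That cost is one way to see why typing full sentences by eye is slow.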
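To improve the text-entry rate, I used the Gemini 1.5 Flash model to generate the patient's intended sentence from keywords and context. First, Gaze Link transcribes the caretaker's speech into text, such as "Is the room temperature ok?". Then, the patient types keywords such as "hot, AC, two" with their eyes. The Gemini model uses this information to generate a suitable sentence, such as "I am hot, can you turn the AC down by 2 degrees?", in under a second. The model and keyboard also work with Spanish and Chinese. Experiments with 30 people show that the model can save up to 85% of user keystrokes and make Gaze Link 7x more effective than traditional E-transfer boards.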
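The keyword-to-sentence step can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's actual prompt or metric: `build_prompt` and `keystroke_savings` are hypothetical helpers, the prompt wording is invented for the example, and the savings figure assumes one keystroke per character. The resulting prompt string would be sent to a Gemini model; that network call is left out so the sketch stays self-contained.

```python
def build_prompt(caretaker_text, keywords, language="English"):
    """Combine the transcribed question and the eye-typed keywords.

    The wording below is an assumption made for this sketch.
    """
    keyword_list = ", ".join(k.strip() for k in keywords if k.strip())
    return (
        f'The caretaker said: "{caretaker_text}"\n'
        f"The patient typed these keywords: {keyword_list}\n"
        f"Reply in {language} with one short first-person sentence "
        "that the patient most likely intends."
    )


def keystroke_savings(keywords, sentence):
    """Fraction of keystrokes avoided, assuming one keystroke per character."""
    if not sentence:
        raise ValueError("sentence must not be empty")
    typed = sum(len(k) for k in keywords)
    return 1 - typed / len(sentence)


prompt = build_prompt("Is the room temperature ok?", ["hot", "AC", "two"])
savings = keystroke_savings(
    ["hot", "AC", "two"],
    "I am hot, can you turn the AC down by 2 degrees?",
)
```

For the example from the text, the patient types 8 characters instead of a 48-character sentence, a saving of about 83% under this simplified count.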
Best Android app: Gaze Link, which helps Amyotrophic Lateral Sclerosis (ALS) patients communicate with their eyes.

Built with

- Android
- Firebase
- Google ML Kit

Team

By Xiangzhou Sun, from the United States