
In Progress
Posted
Paid on delivery
**Translation into English:**

---

**No Restriction on Programming Language: Text Processing Task**

Extract group chat records from a large text file and slowly write them into individual small text files in the target folder.

---

### GitHub Repository

Requirements document and text to be processed are available at the GitHub repository: [login to view URL]

---

### Summary of Requirements

Extract the WhatsApp group chat records from a large text file `[login to view URL]` located in the `HistoryMsg` directory and slowly release them into the target folder. Each message should become a small file where the content is the translated version of the message. For images or other attachments (images do not require translation), release them as individual files, one file per attachment. After writing each small file, wait for a random time between **10 and 100 seconds**.

---

### Key Details

1. **Programming Language & Runtime Environment**
   - No restriction on the programming language.
   - The runtime interpreter and program itself must use minimal memory, ideally not exceeding **100MB**.
2. **Translation**
   - Use free translation solutions like the Google Translate API, Google Gemini free-tier LLM, or others.
   - Translation options:
     - Translate the entire `[login to view URL]` at once and save it as `[login to view URL]` (do not translate sender names or attachment descriptions).
     - Alternatively, translate each message individually when processing.
3. **Processing Logic** (Reference Logic)
   - Start iterating over all subfolders in `HistoryMsg` (created more than 10 minutes ago, sorted by name):
     - Read the `[login to view URL]` file in the subfolder.
     - Iterate through each message:
       - Extract sender and message content:
         - **Sender**: Extract English characters up to the first Chinese character, excluding Chinese text.
         - **Message Content**:
           - If text:
             - Translate to Chinese (if in English) using Google Translate.
             - Write to a text file in the target folder: `{Sender} said: {Translated Message}`.
           - If an attachment (e.g., image):
             - Write a text file with content: `{Sender} sent an image (or file) as follows:`.
             - Copy the attachment file to the target folder with a new name.
             - If multiple images/files in sequence, repeat but ensure a delay of at least **1 second** between copying each file.
       - After successfully processing a message, delete it from `[login to view URL]` and save the file to prevent re-processing in case of program crash.
       - Wait for a random delay (adjustable, e.g., 30 to 300 seconds).
     - Delete the subfolder after processing all its contents.
     - Continue with the next subfolder.
   - After processing all folders, send a notification to the administrator.
   - Wait 5 minutes and check for new subfolders. Repeat the process if found.
4. **Special Cases**
   - If the attachment is an OPUS file (audio), convert it to an MP3 file.

---

### Example of Consecutive Messages with Multiple Images

```
[29/06/23 10:40:47 AM] ~ Evan HO:<Attachment: [login to view URL]>
[29/06/23 10:41:00 AM] ~ Evan HO:<Attachment: [login to view URL]>
[29/06/23 10:41:12 AM] ~ Evan HO:<Attachment: [login to view URL]>
```

---

### Examples of Sender Name Processing

- **[login to view URL]汪先生** → `[login to view URL]`
- **Stephanie来自邻居群Whatsapp** → `Stephanie`
- **Coo ?上海人 英国工作 每年来马一次** → `Coo`
- **Cecilia 老公 叫Rick Khoo T1-30-02** → `Cecilia`

---

### Configurable Parameters

- **Target Folder Path**
- **Random Delay for Each Message** (e.g., 10-100 seconds).

---

### File Naming in Target Folder

- Text file: `whatsapp_03日_23.34.02 - {strReceiverID}.txt`
- Image file: `whatsapp_03日_23.34.02 - {strReceiverID}.jpg` (or other extensions)

Where `03日_23.34.02` is the current date and time.
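The file-naming rule above could be sketched in Python roughly as follows. This is only an illustration: the function name `target_filename` and its parameters are assumptions, and the posting does not say whether the day and time should come from local time or another clock, so local time is assumed here.

```python
from datetime import datetime
from typing import Optional


def target_filename(receiver_id: str, extension: str,
                    now: Optional[datetime] = None) -> str:
    """Build a name such as 'whatsapp_03日_23.34.02 - @@123456789.txt'.

    receiver_id is the ReceiverGroupID value from the configuration file;
    extension includes the leading dot (".txt", ".jpg", ...).
    """
    now = now or datetime.now()  # assumption: local time of the server
    # Format the numeric parts on their own and add "日" by hand, so the
    # result does not depend on how strftime treats non-ASCII text.
    stamp = f"{now:%d}日_{now:%H.%M.%S}"
    return f"whatsapp_{stamp} - {receiver_id}{extension}"


# Usage with a fixed timestamp so the result is reproducible.
print(target_filename("@@123456789", ".txt", datetime(2024, 12, 3, 23, 34, 2)))
# prints: whatsapp_03日_23.34.02 - @@123456789.txt
```

Because the name only has one-second resolution, uniqueness relies on the posting's rule that two files are never written less than one second apart.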
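### Configuration File Example

```
ReceiverGroupID = "@@123456789"
MinIntervalSeconds = 10
MaxIntervalSeconds = 100
TargetDir = "/root/Dir-SendingOut"
```

---

**Original Chinese requirement (translated into English)**

No restriction on programming language. Text processing: extract the group chat records from a large text file and slowly write them into individual small text files in the target folder.

The requirements document and the text to be processed are in the GitHub repository: [login to view URL]

**Summary of requirements:** Slowly release the WhatsApp group chat records contained in the large text file [login to view URL], located in the `HistoryMsg` directory, into the target folder, translated into Chinese. Each message becomes one small file whose content is the translated text of that message. Images and other attachments (images need no translation) are released the same way, one file at a time. After writing each small file, wait a random 10 to 100 seconds.

The programming language is not restricted, but the runtime interpreter and the program itself must use little memory, ideally no more than 100M.

Translation may call the Google Translate API, the free version of the Google Gemini LLM, or any other free translation solution. For example, call the translation API only once at the very start to translate the whole of [login to view URL] and save the result to [login to view URL]. (Note: do not translate speaker names, and do not translate attachment descriptions.) Calling the translation API once per message is also acceptable.

The logic below is for the developer's reference only. A better approach may be used.

- Start the loop: iterate over all subfolders under `HistoryMsg` that were created more than 10 minutes ago, sorted by file name:
  - Read [login to view URL] from the subfolder.
  - Start the loop: read all messages one by one.
    - Read one message and extract the speaker and the message content.
    - Speaker = the English characters on the left, up to the first Chinese character. Do not include any Chinese.
    - If the content is text:
      - Content = if it is in English, call Google Translate to translate it into Chinese.
      - Write one text file to the target folder: `{Sender}说:{Message}` (that is, "{Sender} said: {Message}").
    - If the content is an attachment file (such as an image):
      - Write the first file, a text file with the content: `{Sender}发图(或文件)如下:` (that is, "{Sender} sent an image (or file) as follows:").
      - Second file: copy the image to the target folder and rename it as the second file.
      - Third file: if several images were sent in a row, there will be a third and a fourth file, and so on, but still only one text file. With multiple images (or attachments), each copy-and-rename must be at least 1 second apart.
    - If all of the above completed without error, delete this message from [login to view URL] and write the file back. (The file must be written back after every deleted message, so that after a crash and restart the program does not process already-handled messages again.)
    - Wait here: the random interval per message is adjustable, e.g. 30 to 300 seconds.
    - Continue the loop with the next message.
  - Once all messages have been read and processed, delete this subfolder.
  - Continue the loop with the next subfolder.
- After all folders have been processed, write a message to the administrator.
- Wait 5 minutes, then check whether there are new subfolders. If so, repeat the processing above.

If an attachment is an OPUS file (an audio format), it must be converted to an MP3 file.

**Example of one person sending several images in a row** (in the source data, `上午` means AM and `附件` means attachment):

- `[29/06/23 上午10:40:47] ~ Evan HO:<附件:[login to view URL]>`
- `[29/06/23 上午10:41:00] ~ Evan HO:<附件:[login to view URL]>`
- `[29/06/23 上午10:41:12] ~ Evan HO:<附件:[login to view URL]>`

**Examples of how speaker nicknames are handled** (the original marks the unwanted part with strikethrough, which is lost in this copy; only the leading name is kept):

- [login to view URL]汪先生 → `[login to view URL]`
- Stephanie来自邻居群Whatsapp → `Stephanie`
- Coo ?上海人 英国工作 每年来马一次 → `Coo`
- Cecilia 老公 叫Rick Khoo T1-30-02 → `Cecilia`

**Parameters that can be set in the configuration file:**

- Location of the target folder
- Adjustable random interval per message, e.g. 10 to 100 seconds

**File naming rules in the target folder:**

- Text message file: `whatsapp_03日_23.34.02 - {strReceiverID}.txt`
- Image file: `whatsapp_03日_23.34.02 - {strReceiverID}.jpg`, or png, or another file extension

`03日_23.34.02` is the current date and time. Because each new small file is always written at least 1 second after the previous one, file names should in theory never collide. The value of `strReceiverID` is taken from `ReceiverGroupID` in the configuration file. The file content is the WeChat message text to be sent, or the image itself.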
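The nickname rule in the requirement (keep the leading English characters and stop at the first Chinese character) might be sketched in Python as below. The function name `extract_sender`, the Unicode ranges used to recognise a Chinese character, and the set of trailing characters that get trimmed are all assumptions made for illustration. The sketch also does not strip the `~ ` prefix that appears before names in the chat lines.

```python
import re

# Assumption: "Chinese character" means a CJK unified ideograph
# (basic block U+4E00-U+9FFF plus Extension A U+3400-U+4DBF).
_FIRST_CJK = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")

# Assumption: spaces and stray punctuation left in front of the Chinese
# text (as in "Coo ?上海人") are not part of the name.
_TRAILING_JUNK = " \t?？~-_.,:："


def extract_sender(nickname: str) -> str:
    """Return the part of a nickname that precedes the first Chinese character."""
    match = _FIRST_CJK.search(nickname)
    prefix = nickname[:match.start()] if match else nickname
    return prefix.rstrip(_TRAILING_JUNK)


print(extract_sender("Stephanie来自邻居群Whatsapp"))          # prints: Stephanie
print(extract_sender("Coo ?上海人 英国工作 每年来马一次"))      # prints: Coo
print(extract_sender("Cecilia 老公 叫Rick Khoo T1-30-02"))    # prints: Cecilia
```

A nickname with no Chinese characters at all, such as `Evan HO`, is returned unchanged apart from the trailing trim.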
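**Configuration items in the configuration file:**

- `ReceiverGroupID = "@@123456789"`
- `MinIntervalSeconds = 10`
- `MaxIntervalSeconds = 100`
- `TargetDir = "/root/Dir-SendingOut"`

**Programming language and memory:** The programming language is not restricted, but the server environment is Debian 11 x64 with 1G of memory. Memory is limited, and other apps are running on the same server. Python3 is already installed on the server. The implementation therefore needs to be memory-efficient: memory usage must not exceed 200M and should preferably stay below 100M.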
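As a rough illustration of how the configuration items could drive the per-message wait, here is a minimal Python sketch. It assumes the configuration file consists of plain `Key = value` lines exactly as in the example. The names `parse_config` and `pick_delay` are hypothetical, and the fallback values of 10 and 100 seconds simply mirror the example values.

```python
import random

SAMPLE_CONFIG = '''
ReceiverGroupID = "@@123456789"
MinIntervalSeconds = 10
MaxIntervalSeconds = 100
TargetDir = "/root/Dir-SendingOut"
'''


def parse_config(text: str) -> dict:
    """Parse 'Key = value' lines into a dict of strings (quotes removed)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip().strip('"')
    return config


def pick_delay(config: dict) -> float:
    """Pick a random wait in seconds between the configured bounds."""
    low = int(config.get("MinIntervalSeconds", 10))
    high = int(config.get("MaxIntervalSeconds", 100))
    return random.uniform(low, high)


config = parse_config(SAMPLE_CONFIG)
print(config["TargetDir"])          # prints: /root/Dir-SendingOut
delay = pick_delay(config)          # a float within the 10..100 second range
# In the real loop this value would be passed to time.sleep(delay)
# after each small file has been written.
```

Parsing by hand keeps the sketch free of third-party dependencies, which fits the stated memory limit.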
Project ID: 38864085
4 proposals
Remote project
Active 1 yr ago
4 freelancers are bidding on average $155 USD for this job

Hello sir, I have 9 years of experience in MERN development. Let's connect for a detailed discussion. Best Regards, Bhargav
$200 USD in 7 days
6.7

Hi, I can create No Restriction on Programming Language: Text Processing Task Extract group chat records from a large text file and slowly write them into individual small text files in the target folder. I am an experienced Web developer and work on crypto currency development and equipped with all the necessary skills to provide you best website that completely satisfies your business needs. Please share your requirements with me over chat so we can proceed further. Best Regards, Neha
$140 USD in 7 days
5.4

Hi, for the past 15 years I have done website design, website UI development, database design and QA testing by myself. I have also developed a wide variety of projects using the newest web frameworks and third-party libraries.

My main skills include:
- React Js, CoffeeScript, jQuery, Vue Js, Ember Js, D3 Js, Antd Design, Tailwind Css, MUI, Vuetify, Bootstrap-vue, Buefy, Sass
- Django, ROR, Laravel, Python, Node.js
- Redis, MySql, MongoDB, Sqlite, Database Design
- AWS, Google Cloud, Azure, Digital Ocean, Heroku, Firebase

My extra skills include:
- Next Js, Nuxt Js, Shopify, Magento
- ReactNative, Cordova/Phonegap, Ionic
- Nginx, Apache, Caddy, Tomcat
- Ubuntu, CentOS, Gitlab CI, Azure DevOps, CI/CD, Webflow

If you want, I can show you more applications I have built before. Please contact me and I will show you my results. Looking forward to hearing from you soon. Thank you.
$140 USD in 7 days
0.0

As a passionate IT professional with diverse skills in Software Development and Quality Assurance, I believe I am a valuable asset for your project. My extensive experience in Python development and proficiency in programming under constrained-memories perfectly aligns with the requirements of your project. Having an in-depth understanding of Google Translate API and Google Gemini's LLM for effective translation integrates seamlessly into your need to translate texts as every message is analyzed and processed.
$140 USD in 7 days
0.0

Kuala Lumpur, Malaysia
Payment method verified
Member since Dec 6, 2024