I need help writing a script that can parse two text files and load their contents into a database table.
The files are exports from a PDF book of articles and follow a pattern that can be used for parsing.
The content is written in Cyrillic characters, so I need someone who at least understands the character set.
Due to the complexity of the content, I'm fine with getting just two columns of data: article_title and article_content.
article_title:
1. Always starts with a word or sentence in ALL CAPITAL letters, for example:
ДОБАР И ПОДОБАР СЕ ЗАЕДНО
2. Always ends with the " - " sequence, which is followed by the article_content
3. May include lowercase words after a comma (",") when the article is about a person
article_content:
1. All text, including the title, up to the beginning of the next article, which starts with ALL CAPS
There are a few other cases, but this would be the base.
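The base rules above could be sketched in Python roughly as follows. This is only a minimal illustration under several assumptions that are not in the posting: the title and its " - " terminator sit on the same line, capital letters fall in the Unicode range U+0400 to U+042F, a title has at least two capital letters, and SQLite is the target database. The names `parse_articles`, `store_articles`, and the table name `articles` are hypothetical.

```python
import re
import sqlite3

# Assumption: uppercase Cyrillic is U+0400-U+042F (А-Я plus letters such as Ј, Љ, Њ, Ќ).
UPPER = r"\u0400-\u042F"

TITLE_RE = re.compile(
    rf"^(?P<title>[{UPPER}][{UPPER} ]*[{UPPER}]"  # ALL-CAPS word or phrase at line start
    rf"(?:\s?,\s?[^\n]*?)?)"                      # optional ", lowercase words" (person articles)
    rf" - ",                                      # " - " ends the title
    re.MULTILINE,
)


def parse_articles(text):
    """Return a list of (article_title, article_content) tuples.

    The content runs from the start of the title to the start of the
    next title, so it includes the title itself.
    """
    matches = list(TITLE_RE.finditer(text))
    articles = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        title = match.group("title").strip()
        content = text[match.start():end].strip()
        articles.append((title, content))
    return articles


def store_articles(articles, db_path=":memory:"):
    """Insert the parsed pairs into a two-column SQLite table (assumed schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles "
        "(article_title TEXT, article_content TEXT)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?)", articles)
    conn.commit()
    return conn
```

A content line that happens to begin with capital letters followed by " - " would be misread as a new title, and titles containing digits or punctuation would be missed; those belong to the "other cases" and would need extra rules. The input files should be opened with an explicit encoding, for example `open(path, encoding="utf-8")`, assuming the export is UTF-8.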
43 freelancers are bidding an average of $156 for this job
Hey, I hope to see you in chat. Though I am new to freelancer.com, I am an experienced Python developer with full-stack knowledge and experience. I'm sure I can do this perfectly. Thanks for your kind attention.
Hello. I have rich experience in MySQL, Perl, PHP, Python, and regular expressions. I have read your project description carefully and I can do it perfectly. Please contact me. Thanks.
I can parse this with Python and export to a MySQL or SQLite DB (or MongoDB). I also understand Cyrillic, as I have studied Russian a little. But what are your specifications for the DB schema?