I want to load a large amount of data, roughly 3 GB with about 1,000 fields, into Weka. I keep running into various errors and would like a Weka expert to help. This is a short project.

The data includes multi-byte characters from languages such as Chinese, Japanese, and Thai. I have the data as CSV files, have used R to write them out as .arff, and have also saved them as .xlsx files. I just need someone to make these files readable by Weka.

The issues include wrong data types created when writing the .arff files in R, CSV files that will not load, spurious line feed and newline characters causing EOL errors, and "number expected, found TOKEN[]" errors. Again, one major issue is the Japanese and Chinese text. I have used a regex lookbehind to remove the LF characters, but only in Notepad++, which cannot handle big files.

The data is proprietary, so you will have to work through TeamViewer. I am using both Windows and Mac. Success is defined as being able to load these files into Weka.
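
As a rough illustration of the kind of cleanup that could replace the Notepad++ step, here is a minimal Python sketch that streams a CSV row by row, so file size should not be a problem, and replaces line breaks inside fields with a space. It assumes the files are UTF-8 and that the stray line breaks sit inside properly quoted fields; neither has been confirmed for this data. The function names `strip_embedded_newlines` and `clean_file` are made up for the example.

```python
import csv
import io


def strip_embedded_newlines(src, dst):
    """Copy CSV rows from src to dst, replacing CR/LF inside fields with a space.

    Rows are processed one at a time, so memory use stays small.
    Returns the number of rows written (including the header row).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    rows = 0
    for row in reader:
        cleaned = [
            field.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")
            for field in row
        ]
        writer.writerow(cleaned)
        rows += 1
    return rows


def clean_file(in_path, out_path, encoding="utf-8"):
    """Hypothetical wrapper for files on disk; the encoding is an assumption."""
    # newline="" lets the csv module see the raw line endings inside quoted fields.
    with open(in_path, newline="", encoding=encoding) as src, \
            open(out_path, "w", newline="", encoding=encoding) as dst:
        return strip_embedded_newlines(src, dst)


# Small in-memory demonstration with Japanese and Thai text.
sample = 'id,text\n1,"東京\nタワー"\n2,"ไทย"\n'
out = io.StringIO()
row_count = strip_embedded_newlines(io.StringIO(sample), out)
cleaned_text = out.getvalue()
```

If the stray line breaks are not inside quoted fields, this approach will not recover the rows, and the file would need a different repair, for example rejoining lines based on the expected field count.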