Define workflow and sample code to grep 500k terms on 0.5GB text with AWS
$30-250 USD
Cancelled
Posted over 7 years ago
Paid on delivery
I have two files: (A) a file with 500k terms and (B) a file with 0.5 GB of raw text. I want to return the lines in B that contain a term from A. Max 50 lines. Desired output: term 1: line A; term 1: line B; term 2: line C, etc. I am familiar with EC2, but I am hoping there is an efficient way to set this up (MapReduce?).
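For illustration only, here is a minimal Python sketch of one way the matching step could work: an Aho-Corasick automaton built from the terms, so each line of the text is scanned once no matter how many terms there are. Nothing here comes from the posting itself. The function names `build_automaton` and `match_lines`, the `max_per_term` parameter, case-sensitive substring matching, and reading "max 50 lines" as a cap per term are all assumptions.

```python
from collections import defaultdict, deque


def build_automaton(terms):
    """Build Aho-Corasick tables: goto transitions, failure links, outputs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> terms that end at this state
    for term in terms:
        if not term:
            continue  # skip empty terms (they would match every line)
        state = 0
        for ch in term:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        if not out[state]:  # ignore duplicate terms
            out[state].append(term)
    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit terms that end at the fallback state (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def match_lines(terms, lines, max_per_term=50):
    """Return {term: [matching lines]}, keeping at most max_per_term lines each.

    Assumption: the cap applies per term, not to the output as a whole.
    """
    goto, fail, out = build_automaton(terms)
    hits = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        state = 0
        seen = set()  # terms already recorded for this line
        for ch in line:
            while state and ch not in goto[state]:
                state = fail[state]
            state = goto[state].get(ch, 0)
            for term in out[state]:
                if term not in seen and len(hits[term]) < max_per_term:
                    seen.add(term)
                    hits[term].append(line)
    return dict(hits)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for file A (terms) and file B (raw text).
    demo_terms = ["cat", "at", "dog"]
    demo_lines = ["the cat sat", "hot dog", "nothing here"]
    for term, matched in match_lines(demo_terms, demo_lines).items():
        for line in matched:
            print(f"{term}: {line}")
```

Each line is matched independently of the others, so the text file could be split into chunks and processed in parallel (for example as map tasks), with the per-term results merged and capped afterwards. A pure-Python automaton for 500k terms may use a lot of memory, so treat this as a sketch of the technique and not a tuned implementation.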
Hi!
Could you please let me know whether I have this right: for each term in A, would you like a maximum of 50 lines from B returned? Or would you like a maximum of 50 lines returned in total, across all the terms in A?
If this is a one-time process, I would be happy to use my own servers for this.
Regards,
Ku