Many developers have access to the Verisign TLD zone files. A common question is how best to extract the domain names for processing by another program, or simply for insertion into a database such as MSSQL or MySQL.
The good news is that it’s not that hard at all. All you need to do is grep the file like so:
For .NET domains:
grep -E "^[a-zA-Z0-9-]+ NS " net.zone | sed "s/ NS .*//" | uniq > netdomains.txt
For .COM domains:
grep -E "^[a-zA-Z0-9-]+ NS " com.zone | sed "s/ NS .*//" | uniq > comdomains.txt
For .EDU domains:
grep -E "^[a-zA-Z0-9-]+ NS " edu.zone | sed "s/ NS .*//" | uniq > edudomains.txt
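To see what this kind of pipeline keeps and what it drops, here is a small sketch run against a made-up sample. The file name sample.zone, the function name extract_names and the sample records are all hypothetical, and the layout (name, then NS, then a nameserver) is an assumption about the zone file format. The sketch uses grep -E and a sed pattern anchored on the space before NS, so that a name ending in NS (such as DNS) is not cut short.

```shell
# Made-up sample in the layout assumed above; not real zone data.
cat > sample.zone <<'EOF'
EXAMPLE NS NS1.EXAMPLE-DNS.NET.
EXAMPLE NS NS2.EXAMPLE-DNS.NET.
DNS NS NS1.EXAMPLE-DNS.NET.
NS1.EXAMPLE A 192.0.2.1
EOF

# Hypothetical helper: print each delegated name once.
extract_names() {
  # keep only "<name> NS <server>" lines, strip everything from " NS " on,
  # then collapse adjacent repeats of the same name
  grep -E '^[a-zA-Z0-9-]+ NS ' "$1" | sed 's/ NS .*//' | uniq
}

extract_names sample.zone
# prints:
# EXAMPLE
# DNS
```

The A record line is skipped because it does not match the NS pattern, and the two EXAMPLE lines collapse into one because they are adjacent.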
Grepping these files will create a file that contains just the list of domains, with each domain on its own line. Because a domain usually has several NS records on consecutive lines, uniq collapses them into a single entry. The names in the output file will be missing the .com/.net/.edu extension, so you will need to add the correct extension (depending on the file) during your import process, or in your code after the import.
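If you would rather add the extension before the import, something along these lines may work. It is only a sketch: the function name add_tld is hypothetical, and lowercasing the names is an assumption about what your database expects, not a requirement.

```shell
# Hypothetical helper: append a TLD to every bare name in a list.
# $1 = TLD without the leading dot (e.g. com), $2 = file with one name per line
add_tld() {
  # NF skips blank lines; $1 is the first field, so stray
  # leading or trailing whitespace around the name is dropped
  awk -v tld="$1" 'NF { print tolower($1) "." tld }' "$2"
}
```

For example, add_tld com comdomains.txt > comdomains-full.txt would write the full names to a new file (the output file name is just an illustration).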
This process is the same on Mac OS X, BSD, and other UNIX or Linux variants.
It’s really that simple!












