In collaboration with the lab of Katie Pollard at UCSF, we have produced a dataset of protein families called Sifting Families, or SFams.
For more detail, see: Thomas J Sharpton, Guillaume Jospin, Dongying Wu, Morgan GI Langille, Katherine S Pollard, and Jonathan A Eisen. Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource. BMC Bioinformatics 2012, 13:264. doi:10.1186/1471-2105-13-264
Data from the paper is available at http://edhar.genomecenter.ucdavis.edu/sifting_families/