BCF Tool
Software for the estimation of log Bioconcentration Factors (BCF) using three chemical properties: biotransformation half-life (log BioHL), octanol-water distribution coefficient (log D) and topological polar surface area (TPSA).
Software and data required:
- R installed (http://cran.r-project.org/)
- Two R packages:
  - randomForest
  - caret
- The log BCF estimation model itself (download: log_BCF_by_RF, 2.6 MB)
- Log BioHL (Should be estimated with EPI Suite version 4.10.)
- Log D (Should be estimated with SPARC version 4.6.)
- TPSA (Should be estimated with Open Babel version 2.2.0. The file TPSA.txt (TXT, 4.2 MB) includes over 90000 precalculated TPSA values along with the CAS number and SMILES of each substance.)
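The precalculated TPSA values can be matched to a list of substances by CAS number. The following is only a minimal sketch: the layout of TPSA.txt is not described here, so the column names (CAS, SMILES, TPSA), the placeholder CAS identifiers and all descriptor values other than those used in the usage example are assumptions for illustration. Inspect the real file and adapt the names before use.

```r
# Stand-in for the table read from TPSA.txt. The column names are ASSUMED;
# the real file may use a different header and separator.
tpsa_table <- data.frame(
  CAS    = c("0000-00-1", "0000-00-2"),  # placeholder identifiers, not real CAS numbers
  SMILES = c("CCO", "CCC"),              # illustrative entries only
  TPSA   = c(20.23, 0.00),
  stringsAsFactors = FALSE
)

# Hypothetical list of substances with the two other descriptors already estimated
my_substances <- data.frame(
  CAS      = c("0000-00-1", "0000-00-3"),
  logD     = c(3.33, 1.20),
  logBioHL = c(-0.92, 0.10),
  stringsAsFactors = FALSE
)

# Left join on CAS: every substance is kept, and TPSA is NA where no
# precalculated value exists in the table
merged <- merge(my_substances, tpsa_table[, c("CAS", "TPSA")],
                by = "CAS", all.x = TRUE)
print(merged)
```

Substances that come back with a missing TPSA would still need a value estimated with Open Babel, as stated in the requirements.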
Installation and usage:
# install R itself and open the provided command line interface or call it with "R" on your shell
# install packages if not already installed
install.packages('randomForest')
install.packages('caret')
# load the libraries
library(randomForest)
library(caret)
# load the BCF model
load('directory/log_BCF_by_RF')
# create data frame (the descriptors [column names] have to be named like in the following example)
to_predict <- data.frame(logD=c(3.33), logBioHL=c(-0.92), TPSA=c(20.23))
# or read in a file with prepared data (dropping rows with missing values), for instance
to_predict <- na.omit(read.csv('filepath'))
# if the header was different than the expected "logD", "logBioHL" and "TPSA", assign the columns to a new data frame with "c1", "c2", "c3" being the corresponding column names in your file
to_predict_new <- data.frame(logD=to_predict$c1, logBioHL=to_predict$c2, TPSA=to_predict$c3)
# or just change the names of the columns directly; for instance if logD was the first, logBioHL was the second and TPSA was the third column in your file
names(to_predict) <- c('logD', 'logBioHL', 'TPSA')
# calculate log BCFs using the model named "log_BCF_by_RF" and the data in "to_predict"
logBCF <- predict(log_BCF_by_RF, newdata=to_predict)
# print log BCFs
print(logBCF)
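Because the model expects the exact column names "logD", "logBioHL" and "TPSA", it can help to check a table before calling predict(). The helper below is a hypothetical sketch and not part of the BCF tool; the function name check_descriptors and the second example row are assumptions made for illustration.

```r
# Hypothetical helper (not part of the BCF tool): verify that a data frame
# carries the three descriptors under the expected names and as numbers.
required <- c("logD", "logBioHL", "TPSA")

check_descriptors <- function(df) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing descriptor column(s): ", paste(missing_cols, collapse = ", "))
  }
  # keep only the descriptor columns, in a fixed order
  df <- df[, required, drop = FALSE]
  not_numeric <- required[!vapply(df, is.numeric, logical(1))]
  if (length(not_numeric) > 0) {
    stop("non-numeric descriptor column(s): ", paste(not_numeric, collapse = ", "))
  }
  # drop rows with missing values, as in the read.csv example above
  na.omit(df)
}

# Example: the first row uses the values from the usage example,
# the second row is made up and incomplete, so it is dropped
to_predict <- data.frame(logD = c(3.33, 1.20),
                         logBioHL = c(-0.92, NA),
                         TPSA = c(20.23, 40.46))
to_predict <- check_descriptors(to_predict)
print(to_predict)
```

The checked data frame can then be passed to predict(log_BCF_by_RF, newdata=to_predict) as in the steps above.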
Citation:
Strempel S, Nendza M, Scheringer M, Hungerbühler K. Using conditional inference trees and random forests to predict the bioaccumulation potential of organic chemicals. Environ Toxicol Chem.
Notes and license:
Please report any bugs or suggestions for improvement to the authors mentioned in the citation above.
The "log_BCF_by_RF" software is licensed under GPLv2.
Acknowledgement:
This work was supported by the EU 6th Framework Integrated Project OSIRIS (contract no. GOCE ET-2007-037017), http://www.osiris-reach.eu/.