BCF Tool

Software for the estimation of log Bioconcentration Factors (BCF) using three chemical properties: biotransformation half-life (log BioHL), octanol-water distribution coefficient (log D) and topological polar surface area (TPSA).

Software and data required:

  1. R installed (http://cran.r-project.org/)
  2. Two R packages:
    • randomForest
    • caret
  3. The log BCF estimation model itself (download: log_BCF_by_RF, 2.6 MB)
  4. Log BioHL (should be estimated with EPI Suite version 4.10)
  5. Log D (should be estimated with SPARC version 4.6)
  6. TPSA (should be estimated with Open Babel version 2.2.0; the downloadable file TPSA.txt (TXT, 4.2 MB) includes over 90000 precalculated TPSAs along with the CAS number and SMILES of each substance)
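For substances covered by the precalculated table, the TPSA can be looked up by CAS number instead of being recomputed. The following is only a sketch: the layout of TPSA.txt is not described here, so the tab separator, the header row and the column names CAS, SMILES and TPSA are assumptions to be checked against the real file, and lookup_tpsa is a hypothetical helper. A small stand-in file with the assumed layout is written first so that the example runs on its own.

```r
# ASSUMPTION: TPSA.txt is tab-separated, has a header row, and its columns
# are named CAS, SMILES and TPSA. Inspect the real file and adjust if needed.
# A two-line stand-in file with that assumed layout is created here so the
# sketch is runnable without the download.
tpsa_file <- tempfile(fileext = ".txt")
writeLines(c("CAS\tSMILES\tTPSA",
             "71-43-2\tc1ccccc1\t0.00",   # benzene
             "64-17-5\tCCO\t20.23"),      # ethanol
           tpsa_file)

# read.delim uses tab as separator and does not treat '#' as a comment,
# which matters because '#' can occur in SMILES strings
tpsa_table <- read.delim(tpsa_file, stringsAsFactors = FALSE)

# hypothetical helper: TPSA for each CAS number, NA where the CAS is not listed
lookup_tpsa <- function(cas, table) {
  table$TPSA[match(cas, table$CAS)]
}

print(lookup_tpsa(c("64-17-5", "50-00-0"), tpsa_table))
```

To use the real table, replace tpsa_file with the path of the downloaded TPSA.txt and skip the writeLines step.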

Installation and usage:

# install R itself and open the provided command line interface or call it with "R" on your shell

# install packages if not already installed
install.packages('randomForest')
install.packages('caret')

# load the libraries
library(randomForest)
library(caret)

# load the BCF model
load('directory/log_BCF_by_RF')

# create data frame (the descriptors [column names] have to be named like in the following example)
to_predict <- data.frame(logD=c(3.33), logBioHL=c(-0.92), TPSA=c(20.23))

# or read in a file with prepared data (and omit not available data), for instance
to_predict <- na.omit(read.csv('filepath'))
# if the header differs from the expected "logD", "logBioHL" and "TPSA", rebuild the data frame from the corresponding columns of your file (here called "c1", "c2" and "c3"); the result is stored in "to_predict" again because the prediction step below uses that name
to_predict <- data.frame(logD=to_predict$c1, logBioHL=to_predict$c2, TPSA=to_predict$c3)
# or just rename the columns directly, for instance if logD is the first, logBioHL the second and TPSA the third column in your file
names(to_predict) <- c('logD', 'logBioHL', 'TPSA')

# calculate log BCFs using the model named "log_BCF_by_RF" and the data in "to_predict"
logBCF <- predict(log_BCF_by_RF, newdata=to_predict)

# print log BCFs
print(logBCF)
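Because the model expects exactly the descriptor names logD, logBioHL and TPSA, it can help to check the input before calling predict. The following is a minimal sketch, not part of the BCF Tool: check_descriptors is a hypothetical helper, and the second input row holds made-up values used only to show how incomplete rows are dropped. The final comments assume that log BCF is a base-10 logarithm, as is conventional.

```r
# descriptor names the model expects (taken from the usage example above)
required <- c("logD", "logBioHL", "TPSA")

# hypothetical helper: stop if a descriptor column is missing or non-numeric,
# keep only the three descriptor columns, and drop rows containing NA
check_descriptors <- function(df) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing descriptor column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  out <- df[, required, drop = FALSE]
  is_num <- vapply(out, is.numeric, logical(1))
  if (!all(is_num)) {
    stop("non-numeric descriptor column(s): ",
         paste(required[!is_num], collapse = ", "))
  }
  na.omit(out)
}

# first row: values from the example above; second row: illustrative only,
# with a missing logD so that it is removed
to_predict <- check_descriptors(
  data.frame(logD = c(3.33, NA), logBioHL = c(-0.92, 0.5), TPSA = c(20.23, 40.1))
)
print(to_predict)

# with the model loaded as described above, prediction works as before:
# logBCF <- predict(log_BCF_by_RF, newdata = to_predict)
# and, assuming base-10 logarithms, the BCF itself would be:
# BCF <- 10^logBCF
```

Failing early with a message that names the missing column is usually easier to debug than an error raised inside predict.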

Citation:

Strempel S, Nendza M, Scheringer M, Hungerbühler K. Using conditional inference trees and random forests to predict the bioaccumulation potential of organic chemicals. Environ Toxicol Chem.

Notes and license:

Please report any bugs or suggestions for improvement to the authors mentioned in the citation above.

The "log_BCF_by_RF" software is licensed under GPLv2.

Acknowledgement:

This work was supported by the EU 6th Framework Integrated Project OSIRIS (contract no. GOCE ET-2007-037017), http://www.osiris-reach.eu/.
