Monday, August 31, 2015

The solution to the spinach fallacy?


Last week I blogged about Spinach and the iron fallacy. I analysed an early set of data by Thomas Richardson (1848), who calculated the amount of iron in combusted ash for various vegetables and fruits, and showed that spinach is not at all unusual in its constituents. The idea that spinach is rich in iron is untrue, and the story about a misplaced decimal point seems to be nothing more than an urban myth.

In the meantime, Joachim Dagg, at the Natural History Apostilles blog, has reanalysed Richardson's data and revealed that the first source for the spinach-iron myth is likely to have been a somewhat inappropriate attempt to combine his data for the percent iron values in relation to the ash with the percent values of the ashes in relation to the fresh matter.

So, I have recalculated the phylogenetic network using these "adjusted" values. I used the percent values of the chemical constituents in relation to the pure ash (raw ash minus carbonic acid, charcoal and sand), and combined them with the percent values of the ashes. The issue here is that radish roots and leaves have the largest ash values, followed by cherry stems and spinach. Combining the two sets of percentages therefore leads to an overstatement of the chemical contents of these samples. In particular, the iron content moves spinach from being ranked sixth to second (behind radish foliage, which is not usually eaten).
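The effect of combining the two sets of percentages can be illustrated with a small sketch. It assumes that "combining" means multiplying the percent of a constituent in the ash by the percent of ash in the fresh matter; the function name and the two sample values below are hypothetical placeholders, not Richardson's figures.

```python
def percent_of_fresh_matter(pct_in_ash, pct_ash_in_fresh):
    """Convert a percentage of the ash into a percentage of the fresh matter.

    Assumes a simple product of the two percentages (an assumption about
    how the "adjusted" values were combined).
    """
    return pct_in_ash * pct_ash_in_fresh / 100.0


# Hypothetical placeholder values, NOT taken from Richardson (1848).
samples = {
    "sample A": {"iron_pct_of_ash": 2.0, "ash_pct_of_fresh": 1.0},
    "sample B": {"iron_pct_of_ash": 1.0, "ash_pct_of_fresh": 4.0},
}


def rank(values):
    """Return the sample names ordered from the largest value to the smallest."""
    return sorted(values, key=values.get, reverse=True)


# Ranking by iron as a percentage of the ash alone.
by_ash = rank({name: s["iron_pct_of_ash"] for name, s in samples.items()})

# Ranking after weighting by the ash content of the fresh matter.
by_fresh = rank({
    name: percent_of_fresh_matter(s["iron_pct_of_ash"], s["ash_pct_of_fresh"])
    for name, s in samples.items()
})

print(by_ash)    # sample A first: more iron per unit of ash
print(by_fresh)  # sample B first: its larger ash value dominates
```

A sample with a large ash value can thus overtake one with more iron per unit of ash, which is the kind of rank change described above for spinach.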


Wednesday, August 26, 2015

Request for datasets


During one of the discussion sessions at the recent Phylogenetic Network Workshop in Singapore, the need was reiterated for "gold standard" empirical datasets, in order to aid the development and validation of algorithms for phylogenetic networks.

The current collection of such datasets is located on this blog, at:
http://phylonetworks.blogspot.se/p/datasets.html
However, it is still quite a small database, as so far it has been based solely on my own ability to locate suitable datasets that are freely available (see the comments in Public availability of phylogenetic data).

I would therefore like to remind everyone that if you have, or know of, suitable empirical datasets, then please contact me.

The database is currently hierarchically arranged as follows:

Datasets where the history is a tree
  Datasets where the history is known from experimentation
  Datasets where the history is known from retrospective observation
Datasets where the history is reticulated
  Datasets where the history is known from experimentation
    Hybridization
    Contamination
  Datasets where the reticulation is inferred
    Hybridization
    Recombination
    Lateral Gene Transfer
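
For anyone wanting to handle this arrangement programmatically, the hierarchy above could be written as a nested mapping. This is only an illustrative sketch: the names `DATASET_CATEGORIES` and `leaf_paths` are hypothetical, and the blog's database is not actually stored in this form.

```python
# Nested mapping mirroring the category hierarchy listed above.
# An empty dict marks a category with no further subdivision.
DATASET_CATEGORIES = {
    "Datasets where the history is a tree": {
        "Datasets where the history is known from experimentation": {},
        "Datasets where the history is known from retrospective observation": {},
    },
    "Datasets where the history is reticulated": {
        "Datasets where the history is known from experimentation": {
            "Hybridization": {},
            "Contamination": {},
        },
        "Datasets where the reticulation is inferred": {
            "Hybridization": {},
            "Recombination": {},
            "Lateral Gene Transfer": {},
        },
    },
}


def leaf_paths(tree, prefix=()):
    """Yield the full path (as a tuple of names) to every lowest-level category."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


for path in leaf_paths(DATASET_CATEGORIES):
    print(" > ".join(path))
```

Each printed line is one slot into which a contributed dataset would fall.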

The basic requirement for a "gold standard" dataset that contains one or more reticulations (i.e. there is gene flow) is that the evidence for the reticulation(s) is independent of the particular dataset. That is, there should be either experimental data, or at least another independent dataset, confirming the gene flow. This is quite a tough criterion, particularly for lateral gene transfer, but it is necessary for ensuring quality.

Finally, the database requires the processed data (e.g. a multiple sequence alignment), rather than the original raw data (see the comments in Releasing phylogenetic data).