Classification is hard: can network science help?
September 7th, 2004 | by ian
I was inspired by a friend's post: Science is easier from the outside. Given my background in experimental evolutionary biology I thought I would throw a few comments his way, but those few comments combined to form something that probably oversteps the bounds of what can be considered a comment.
Classification in biology, or phylogenetics, is fraught with issues that we typically do not face when creating our own systems of classification, such as the organization of content on a website. Just look at the issues anthropologists have in studying human evolution which, geologically speaking, happened yesterday.
When studying “trees of life” there is the necessarily subjective nomination of a phylogenetic root, which biases the analysis of the rest of the hierarchy in ways that are impossible to avoid (instead we often run many thousands of iterations of analysis on a dataset, varying the choice of root, and this often yields radically different results). Think about it. How would you go about choosing the root of the tree?
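To make the rooting problem concrete, here is a minimal sketch in Python. It assumes a made-up unrooted tree over four taxa (`A` to `D`) joined by two internal nodes (`x`, `y`); the names and the `clades` helper are purely illustrative and not taken from any phylogenetics package. It lists the groups of taxa that appear as branches of the hierarchy for each choice of root.

```python
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical unrooted tree: taxa A-D, internal nodes x and y (illustrative only).
UNROOTED: Dict[str, List[str]] = {
    "A": ["x"], "B": ["x"],
    "C": ["y"], "D": ["y"],
    "x": ["A", "B", "y"],
    "y": ["C", "D", "x"],
}


def clades(tree: Dict[str, List[str]], root: str) -> Set[FrozenSet[str]]:
    """Return the leaf sets below each internal node once the tree is rooted at `root`."""
    found: Set[FrozenSet[str]] = set()

    def leaves_below(node: str, parent: Optional[str]) -> FrozenSet[str]:
        # Children are all neighbours except the node we arrived from.
        children = [n for n in tree[node] if n != parent]
        if not children:
            return frozenset([node])  # a tip of the tree
        below = frozenset().union(*(leaves_below(c, node) for c in children))
        if parent is not None:  # the root's own group is the whole tree, so skip it
            found.add(below)
        return below

    leaves_below(root, None)
    return found


for candidate in ("x", "y", "A"):
    groups = sorted("".join(sorted(g)) for g in clades(UNROOTED, candidate))
    print(candidate, groups)
```

The connections between the taxa never change, yet rooting at `x` yields the group C+D, rooting at `y` yields A+B, and using `A` as the root yields both C+D and B+C+D. Which groupings count as "natural" depends on a choice made before the analysis starts.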
Mismatches between genetic, morphological and life-history based phylogenies abound: what data will you favour? You might think genetics is the most objective form of classification data but this is often problematic:
- it is likely you have much less genetic information to work with (morphology preserves more easily than genetic information)
- genes can be transferred between species via mobile elements, especially in the microbial and plant worlds, which make up the majority of life on earth
- genes can converge to the point where they look like they may have diverged from a common ancestor
Convergence is a problem since it can happen at all levels, including genetic, morphological and life history (the traits being compared evolve separately and converge due to selective pressures, so their similarity does not indicate a shared ancestor).
This is all further compounded by gaps in the fossil record:
- Different body structures and environments determine the ease of fossilization, so the fossil record is biased.
- Speciation can happen in the blink of a geological eye, so to speak, both in terms of the generation of diversity and the subsequent sorting (selection). It is quite a detective story to determine who the suspects are…
Carl von Linné, the father of modern biological taxonomy, didn’t even have the benefit of understanding evolutionary processes, let alone genetics, when he developed his Systema Naturae. Instead he thought he was revealing the divine order in God’s creations. As a result of this starting assumption and a very limited data set that didn’t include much in the way of non-morphological information, his original constructions, while logical given what he had to work with, often did not reflect the natural-historical order.
The wild endeavour of science is one of discovery, not invention, which we will leave to engineers. Scientists don’t have the luxury of constructing our world (and when they indulge in that luxury they often take us down the wrong path… not that that’s a bad thing!). It is a process of discovery fraught with accidental success, abject failure and Eureka moments.
Classification is such a fundamental aspect of science, but it is also a wholly human one. A classification system can be both wildly useful and fundamentally flawed. What happens when something needs to go on two branches that are far apart in the classification structure?
Maybe a tree with a root and branches is the wrong way to look at classification. Perhaps we need to navigate a network of organization instead, to find a happy home for everything, connected to all things related and far apart from that which is not. I admit that I am inspired here, having recently read the book Six Degrees: The Science of a Connected Age, which I believe to be the best account of why studying networks and their behaviours is relevant to all disciplines.
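As a rough illustration of the difference, the sketch below builds a toy classification tree and then the same structure with one extra cross-link, and measures how far apart two items are in each. Everything in it is an assumption made up for the example: the category names, the `flies` trait node, and the `add_link` and `distance` helpers. Distance is simply the number of links on the shortest path, found by breadth-first search.

```python
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]


def add_link(graph: Graph, a: str, b: str) -> None:
    """Connect two nodes in both directions."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def distance(graph: Graph, start: str, goal: str) -> Optional[int]:
    """Breadth-first search: links on the shortest path, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# A strict tree: every item has exactly one parent category (toy data).
tree: Graph = {}
for parent, child in [("animals", "mammals"), ("animals", "birds"),
                      ("mammals", "bat"), ("mammals", "mouse"),
                      ("birds", "sparrow")]:
    add_link(tree, parent, child)

# The same structure as a network: items may also attach to a shared trait.
network: Graph = {node: set(links) for node, links in tree.items()}
add_link(network, "bat", "flies")
add_link(network, "sparrow", "flies")

print(distance(tree, "bat", "sparrow"))     # path must climb to the root and back
print(distance(network, "bat", "sparrow"))  # path can use the shared trait
```

In the tree the bat and the sparrow are four links apart, because the only route between them runs through the root. In the network they are two links apart, while the bat stays just as close to the mouse as before. The item sits on two "branches" at once without the rest of the structure having to be redrawn.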
The likely problem is that conceptually, and possibly even mathematically, a network approach to classification might be too difficult for us!














