Classification is hard, can network science help?
September 7th, 2004 | by ian
I was inspired by a friend's post: Science is easier from the outside. Given my background in experimental evolutionary biology, I thought I would throw a few comments his way, but those few comments combined into something that probably oversteps the bounds of what can be considered a comment.
Classification in biology, and phylogenetics in particular, is fraught with issues that we typically do not face when creating our own systems of classification, such as organizing content on a website. Just look at the issues anthropologists have in studying human evolution, which, geologically speaking, happened yesterday.
When studying “trees of life” there is the necessarily subjective nomination of a phylogenetic root, which biases the analysis of the rest of the hierarchy in ways that are impossible to avoid (instead, we often run many thousands of iterations of analysis on a dataset, varying the choice of root, and this often yields radically different results). Think about it. How would you go about choosing the root of the tree?
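To make the rooting problem concrete, here is a minimal sketch in Python. It assumes a made-up unrooted tree of four taxa (`A` to `D`) joined through two internal nodes (`x`, `y`); the names and the helper `clades_when_rooted_at` are hypothetical and are not taken from any phylogenetics package. It roots the same tree on two different branches and lists the groups that each rooting treats as sharing a common ancestor.

```python
from collections import defaultdict

# Hypothetical unrooted tree: four taxa joined through two internal nodes.
EDGES = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]
LEAVES = {"A", "B", "C", "D"}


def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def clades_when_rooted_at(outgroup, edges=EDGES, leaves=LEAVES):
    """Root the tree on the branch leading to `outgroup` and return the
    clades (sets of leaf names) implied among the remaining taxa."""
    adj = build_adjacency(edges)
    clades = set()

    def leaves_below(node, parent):
        if node in leaves:
            return frozenset([node])
        below = frozenset().union(
            *(leaves_below(n, node) for n in adj[node] if n != parent)
        )
        clades.add(below)  # every internal node defines one clade
        return below

    (neighbour,) = adj[outgroup]  # a leaf has exactly one neighbour
    leaves_below(neighbour, outgroup)
    return clades


if __name__ == "__main__":
    for root in ("A", "C"):
        groups = sorted(sorted(c) for c in clades_when_rooted_at(root))
        print(f"rooted on the branch to {root}: {groups}")
```

The branching pattern is identical in both runs; only the choice of root changes which taxa end up grouped together, which is the bias described above in miniature.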
Mismatches between genetic, morphological and life-history-based phylogenies abound: what data will you favour? You might think genetics is the most objective form of classification data, but this is often problematic:
- it is likely you have much less genetic information to work with (morphology preserves more easily than genetic information)
- genes can be transferred between species via mobile elements, especially in the microbial and plant worlds, which make up the majority of life on earth
- genes can converge to the point where they look like they may have diverged from a common ancestor
Convergence is a problem since it can happen at all levels, including genetic, morphological and life history: the traits being compared evolve separately and converge due to selective pressures, so their similarity does not indicate a shared ancestor.
This is all further compounded by gaps in the fossil record:
- Different body structures and environments determine the ease of fossilization, so the fossil record is biased.
- Speciation can happen in the blink of a geological eye, so to speak, both in terms of the generation of diversity and the subsequent sorting (selection).
It is quite a detective story to determine who the suspects are…
Carl von Linné, the father of modern biological taxonomy, didn’t even have the benefit of understanding evolutionary processes, let alone genetics, when he developed his Systema Naturae. Instead he thought he was revealing the divine order in God’s creations. Because of this starting assumption, and a very limited data set that included little non-morphological information, his original constructions, while logical given what he had to work with, often did not reflect the natural-historical order.
The wild endeavour of science is one of discovery, not invention, which we will leave to engineers. Scientists don’t have the luxury of constructing our world (and when they indulge in that luxury they often take us down the wrong path…not that that’s a bad thing!). It is a process of discovery fraught with accidental success, abject failure and Eureka moments.
Classification is such a fundamental aspect of science, but it is also a wholly human one. A classification system can be both wildly useful and fundamentally flawed. What happens when something needs to go on two branches that are far apart in the classification structure?
Maybe a tree with a root and branches is the wrong way to look at classification. Perhaps we need to navigate a network of organization instead, to find a happy home for everything, connected to all things related and far apart from that which is not. I admit that I am inspired here, having recently read the book Six Degrees: The Science of a Connected Age, which I believe to be the best account of why studying networks and their behaviours is relevant to all disciplines.
The likely problem is that conceptually and possibly even mathematically a network approach to classification might be too difficult for us!
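For the simplest version of the tree-versus-network contrast, here is a rough sketch in Python. It assumes a made-up website with invented section names, and the helpers `undirected` and `hops` are hypothetical illustrations rather than any particular library's API. In the tree, a page can sit on only one branch; in the network, one extra link lets it live next to everything it relates to.

```python
from collections import deque


def undirected(edges):
    """Build an undirected adjacency map from a list of (a, b) links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hops(graph, start, goal):
    """Breadth-first search: number of links on the shortest path,
    or None if the two items are not connected at all."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical site: as a strict tree, the guide must pick one parent.
TREE_EDGES = [
    ("site", "recipes"),
    ("site", "travel"),
    ("recipes", "street-food-guide"),
    ("travel", "city-guides"),
]
# As a network, the same guide may also be linked to the other branch.
NETWORK_EDGES = TREE_EDGES + [("travel", "street-food-guide")]

tree = undirected(TREE_EDGES)
network = undirected(NETWORK_EDGES)

if __name__ == "__main__":
    print("tree:   ", hops(tree, "street-food-guide", "travel"))
    print("network:", hops(network, "street-food-guide", "travel"))
```

In the tree the guide is three links away from the travel section, going up through the root and back down; in the network it is one link away. This toy case is easy; the difficulty I worry about shows up once every item can have many such links.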














