XPath scraping with PHP

Fetching the latest OHLC prices (open, high, low, last) for CME Nikkei futures

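// Download the CME settlements page for Nikkei 225 (Yen) futures.
$html = file_get_contents("http://www.cmegroup.com/trading/equity-index/international-index/nikkei-225-yen_quotes_settlements_futures.html");
if ($html === false) {
	exit("Failed to download the page\n");
}

// Parse the HTML; the @ hides the warnings that real-world markup triggers.
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xml = simplexml_import_dom($dom);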


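// The selected <option> of the trade-date dropdown carries the date in its value attribute.
$ret = $xml->xpath("//*[@id=\"cmeTradeDate\"]/option[@selected='selected']");
//var_dump($ret);

// The offsets below assume a layout like MM/DD/YYYY; the default stays if nothing matches.
$yyyymmdd = "19000101";
foreach( $ret as $e  ){
	$value = (string)$e["value"];
	$yyyy = substr($value,6,4);
	$mm = substr($value,0,2);
	$dd = substr($value,3,2);

	$yyyymmdd = $yyyy . $mm . $dd;

	//echo $yyyymmdd;
	//echo $e;
}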

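// The first row of the settlements table holds the front contract.
$ret = $xml->xpath("//*[@id=\"settlementsFuturesProductTable\"]/tbody/tr[1]");
//var_dump($ret);

$outStr = "";
foreach( $ret as $e  ){

	// Cells 0 to 3 are open, high, low and last; cell 6 is the volume.
	$open = (string)$e->td[0];
	$high = (string)$e->td[1];
	$low = (string)$e->td[2];
	$last = (string)$e->td[3];
	// Drop the final character of the last price (presumably a marker appended to the number).
	$last = substr($last,0,-1);
	$volume = (string)$e->td[6];
	// Remove thousands separators from the volume.
	$volume = str_replace(",","",$volume);

	// Build one CSV line; the leading comma leaves the first column empty.
	$outStr = "," . $yyyymmdd . "," . $open . "," . $high . "," . $low . "," . $last . "," . $volume;
}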
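
To try the extraction logic without hitting the CME site, the same steps can be run against a small inline HTML sample. This is only a sketch: the markup and all numbers in it are made up for illustration, the function name extractQuoteLine is hypothetical, and the real page may be structured differently. It also uses libxml_use_internal_errors() as an alternative to the @ operator for keeping parser warnings quiet.

```php
<?php
// Made-up sample markup and figures; the real CME page is larger and may differ.
$sampleHtml = <<<HTML
<html><body>
<select id="cmeTradeDate">
  <option value="01/04/2016">Monday, 04 Jan 2016</option>
  <option value="01/05/2016" selected="selected">Tuesday, 05 Jan 2016</option>
</select>
<table id="settlementsFuturesProductTable">
  <tbody>
    <tr><td>18500</td><td>18600</td><td>18300</td><td>18450B</td><td>-50</td><td>18455</td><td>12,345</td></tr>
    <tr><td>18400</td><td>18500</td><td>18200</td><td>18350A</td><td>-50</td><td>18355</td><td>678</td></tr>
  </tbody>
</table>
</body></html>
HTML;

// Hypothetical helper: returns the CSV line built from the first table row.
function extractQuoteLine(string $html): string
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xml = simplexml_import_dom($dom);

    // Trade date: rearrange MM/DD/YYYY into YYYYMMDD.
    $yyyymmdd = "19000101";
    foreach ($xml->xpath("//*[@id='cmeTradeDate']/option[@selected='selected']") as $option) {
        $value = (string) $option["value"];
        $yyyymmdd = substr($value, 6, 4) . substr($value, 0, 2) . substr($value, 3, 2);
    }

    // First table row: open, high, low, last (minus trailing marker), volume.
    $line = "";
    foreach ($xml->xpath("//*[@id='settlementsFuturesProductTable']/tbody/tr[1]") as $row) {
        $last = substr((string) $row->td[3], 0, -1);
        $volume = str_replace(",", "", (string) $row->td[6]);
        $line = "," . $yyyymmdd . "," . $row->td[0] . "," . $row->td[1] . "," . $row->td[2] . "," . $last . "," . $volume;
    }
    return $line;
}

echo extractQuoteLine($sampleHtml), "\n"; // prints ",20160105,18500,18600,18300,18450,12345"
```

Wrapping the steps in a function makes it easy to feed in a saved copy of the page while adjusting the XPath expressions.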

Source

Addendum

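When you copy an XPath from Chrome, it may contain a tbody step. Chrome builds the path from its own DOM, where it adds tbody to tables even when the page source has none, so in some cases you have to remove tbody from the path or the query fails.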
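
The tbody problem can be reproduced with a tiny table. The sketch below assumes made-up markup with a hypothetical id of quotes and no tbody in the source; it uses DOMXPath directly rather than SimpleXML, but the XPath behaviour is the same.

```php
<?php
// Made-up markup: the source HTML has no <tbody>, as is common on real pages.
$html = '<html><body><table id="quotes"><tr><td>18500</td><td>18600</td></tr></table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Path as a browser would report it, including the tbody from its own DOM.
$withTbody = $xpath->query("//*[@id='quotes']/tbody/tr[1]/td");

// The same path with the tbody step removed.
$withoutTbody = $xpath->query("//*[@id='quotes']/tr[1]/td");

// A descendant step (//tr) matches whether or not a tbody is present.
$either = $xpath->query("//*[@id='quotes']//tr[1]/td");

echo $withTbody->length, "\n";    // 0
echo $withoutTbody->length, "\n"; // 2
echo $either->length, "\n";       // 2
```

If the page source does contain tbody, as the settlements table above appears to, the path with tbody is the one that matches, so it is worth checking the raw HTML rather than the browser's element inspector.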