Elasticsearch : Cannot detect ES version #791

@FK7

Description

Hi,
I am using the Amazon Elasticsearch Service and a separate Spark cluster on EMR, and I am trying to execute the elasticsearch-hadoop Apache Spark writing example. However, on submitting the job from my local machine I first get the following messages logged at INFO level:

16/06/22 13:56:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
16/06/22 13:56:26 INFO HttpMethodDirector: Retrying request
16/06/22 13:56:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
16/06/22 13:56:26 INFO HttpMethodDirector: Retrying request

and later on I get the following exception.

Stack trace:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:190)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:379)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[search-spark-elasticsearch-5cengch4w4hghs5coa5xu2tyoq.us-east-1.es.amazonaws.com:9200]] 
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:414)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:418)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:122)
    at org.elasticsearch.hadoop.rest.RestClient.esVersion(RestClient.java:564)
    at org.elast

Following is the code that I am running:

Code:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.elasticsearch.spark.rdd.EsSpark

object SparkElasticSearch extends App {

  val conf = new SparkConf().setAppName("ElasticSearchTest").setMaster("local[*]")

  val sc = new SparkContext(conf)
  val esconf = Map("es.nodes" -> "search-spark-elasticsearch-xxxxx.xxxx.es.amazonaws.com", "es.nodes.wan.only" -> "true")

  val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
  val airports = Map("OTP" -> "Otopeni", "SFO" -> "San Fran")

  import org.elasticsearch.spark._
  sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs", esconf)

}
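For what it's worth, the exception message points at the WAN/Cloud settings, and Amazon's Elasticsearch endpoints are typically served over HTTPS on port 443 rather than plain HTTP on 9200 (which is the port the connector tried above). A configuration along the following lines might be worth testing; the `es.port` and `es.net.ssl` values here are an assumption about the Amazon endpoint, not something confirmed in this issue:

```scala
// Hypothetical connector settings for an Amazon Elasticsearch Service endpoint.
// Assumption: the service fronts the cluster with HTTPS on port 443, so the
// default port 9200 (seen in the stack trace) would never connect.
val esconf = Map(
  "es.nodes"          -> "search-spark-elasticsearch-xxxxx.xxxx.es.amazonaws.com",
  "es.port"           -> "443",  // assumed HTTPS port instead of the default 9200
  "es.net.ssl"        -> "true", // enable TLS for the REST transport
  "es.nodes.wan.only" -> "true"  // talk only to the declared endpoint; no node discovery
)
```

This map would then be passed to `saveToEs` exactly as in the snippet above.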

Maven dependency

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.3.2</version>
</dependency>

Is it a bug, or am I missing something here?
