{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[node3-2][192.168.21.88:9301][cluster:admin/reroute]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "[allocate_replica] allocation of [alarm-2017.08.12][0] on node {node4-1}{u47KtJGgQw60T_xm9hmepw}{UbaCHI4KRveQeTAnJvGFEQ}{192.168.21.89}{192.168.21.89:9301}{rack=r4, ml.enabled=true} is not allowed, reason: [NO(shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2017-08-16T00:54:47.088Z], failed_attempts[5], delayed=false, details[failed recovery, failure RecoveryFailedException[[alarm-2017.08.12][0]: Recovery failed from {node8}{Bpd3y--EQsag1u1NTmtZfA}{4T_McpmjSXqLowRoXztssQ}{192.168.21.89}{192.168.21.89:9301}{rack=r4} into {node5}{i4oG4VcaSdKVeNEvStXwAw}{w4nAITEOR9u7liR55qDsVA}{192.168.21.88}{192.168.21.88:9300}{rack=r3}]; nested: RemoteTransportException[[node8][192.168.21.89:9301][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [0] files with total size of [0b]]; nested: FileSystemException[/opt/elasticsearch/elasticsearch-node8/data/nodes/0/indices/FgLdgYTmTfazlP8i5K0Knw/0/index: Too many open files in system]; ], allocation_status[no_attempt]]])][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(target node version [5.5.1] is the same or newer than source node version [5.5.1])][YES(the shard is not being snapshotted)][YES(node passes include/exclude/require filters)][YES(the shard does not exist on the same host)][YES(enough disk for shard on node, free: [6.4tb], shard size: [0b], free after allocating shard: [6.4tb])][YES(below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
  },
  "status": 400
}
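The response itself names a manual retry, /_cluster/reroute?retry_failed=true. A minimal Python sketch of building that call follows. The function name and the base URL are assumptions for illustration, not part of the original setup. The retry is only likely to succeed once whatever made the earlier attempts fail has been fixed.

```python
import urllib.request


def build_retry_failed_request(base_url: str) -> urllib.request.Request:
    """Build the POST request that asks the cluster to retry failed allocations.

    base_url is an assumption, e.g. "http://localhost:9200".
    """
    url = base_url.rstrip("/") + "/_cluster/reroute?retry_failed=true"
    # No request body is needed; the retry is driven by the query parameter.
    return urllib.request.Request(url, method="POST")
```

Passing the returned request to `urllib.request.urlopen` would send it to the cluster.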
Look closely at the error. The root cause is buried in the nested exceptions:
FileSystemException[/opt/elasticsearch/elasticsearch-node8/data/nodes/0/indices/FgLdgYTmTfazlP8i5K0Knw/0/index: Too many open files in system
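"Too many open files in system" is the operating system's message for a full system-wide file table (ENFILE), as opposed to the per-process "Too many open files" (EMFILE). Because several nodes here share one host (node8 and node4-1 both run on 192.168.21.89), it can help to add up the open file descriptors per host. The sketch below assumes the input is the parsed JSON of `GET _nodes/stats/process` and that each node entry carries a `host` field. The function name is hypothetical.

```python
from collections import defaultdict


def open_files_by_host(nodes_stats: dict) -> dict:
    """Sum process.open_file_descriptors over all nodes running on each host.

    nodes_stats is assumed to be the parsed body of GET _nodes/stats/process.
    """
    totals = defaultdict(int)
    for node in nodes_stats.get("nodes", {}).values():
        # Assumption: "host" identifies the machine; fall back if it is absent.
        host = node.get("host", "unknown")
        totals[host] += node.get("process", {}).get("open_file_descriptors", 0)
    return dict(totals)
```

The per-host totals can then be compared with the host's system-wide limit (`fs.file-max` on Linux). Other processes on the machine also count against that limit, so the totals are a lower bound.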
In the shard listing for applog-prod-2016.12.18, the replica of shard 0 is UNASSIGNED:
applog-prod-2016.12.18 4 r STARTED    916460 666.4mb 192.168.21.24 node2-2
applog-prod-2016.12.18 4 p STARTED    916460 666.6mb 192.168.21.23 node1-3
applog-prod-2016.12.18 1 p STARTED    916295 672.8mb 192.168.21.88 node3-3
applog-prod-2016.12.18 1 r STARTED    916295 672.8mb 192.168.21.24 node2-3
applog-prod-2016.12.18 2 r STARTED    916730 670.9mb 192.168.21.89 node4-2
applog-prod-2016.12.18 2 p STARTED    916730 670.9mb 192.168.21.23 node1-3
applog-prod-2016.12.18 3 r STARTED    917570 674.9mb 192.168.21.23 node1-1
applog-prod-2016.12.18 3 p STARTED    917570 674.9mb 192.168.21.24 node2-2
applog-prod-2016.12.18 0 p STARTED    917656 673.5mb 192.168.21.88 node3-2
applog-prod-2016.12.18 0 r UNASSIGNED
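Spotting the unassigned copy by eye gets tedious with many indices. As a rough sketch, the `_cat/shards` text output can be parsed and filtered in Python. The function names are hypothetical, and the parser assumes the default column order shown above (index, shard, prirep, state, then docs, store, ip and node when the shard is assigned). Rows for relocating shards are not handled specially.

```python
def parse_cat_shards(text: str) -> list:
    """Parse _cat/shards text output into a list of dicts."""
    shards = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and a header row (present when ?v is used).
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        shards.append({
            "index": fields[0],
            "shard": int(fields[1]),
            "prirep": fields[2],  # "p" = primary, "r" = replica
            "state": fields[3],
            # The node name is the last column; unassigned rows stop at the state.
            "node": fields[-1] if len(fields) > 4 else None,
        })
    return shards


def unassigned(shards: list) -> list:
    """Return only the shard copies whose state is UNASSIGNED."""
    return [s for s in shards if s["state"] == "UNASSIGNED"]
```

Applied to the listing above, `unassigned(parse_cat_shards(text))` would return the single replica of shard 0.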
Now set number_of_replicas to 0, which drops the unassigned replica:
PUT applog-prod-2016.12.18/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
GET _cat/shards/applog-prod-2016.12.18*
applog-prod-2016.12.18 4 p STARTED 916460 666.6mb 192.168.21.23 node1-3
applog-prod-2016.12.18 1 p STARTED 916295 672.8mb 192.168.21.88 node3-3
applog-prod-2016.12.18 2 p STARTED 916730 670.9mb 192.168.21.23 node1-3
applog-prod-2016.12.18 3 p STARTED 917570 674.9mb 192.168.21.24 node2-2
applog-prod-2016.12.18 0 p STARTED 917656 673.5mb 192.168.21.88 node3-2
Force-merge the segments:
POST /applog-prod-2016.12.18/_forcemerge?max_num_segments=1
Then set number_of_replicas back to 1:
PUT applog-prod-2016.12.18/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
Shard status afterwards:
GET _cat/shards/applog-prod-2016.12.18*
applog-prod-2016.12.18 4 r INITIALIZING                192.168.21.89 node4-1
applog-prod-2016.12.18 4 p STARTED      916460 666.6mb 192.168.21.23 node1-3
applog-prod-2016.12.18 1 p STARTED      916295 672.8mb 192.168.21.88 node3-3
applog-prod-2016.12.18 1 r INITIALIZING                192.168.21.89 node4-3
applog-prod-2016.12.18 2 r STARTED      916730 670.9mb 192.168.21.89 node4-1
applog-prod-2016.12.18 2 p STARTED      916730 670.9mb 192.168.21.23 node1-3
applog-prod-2016.12.18 3 r STARTED      917570 674.9mb 192.168.21.89 node4-3
applog-prod-2016.12.18 3 p STARTED      917570 674.9mb 192.168.21.24 node2-2
applog-prod-2016.12.18 0 p STARTED      917656 673.5mb 192.168.21.88 node3-2
applog-prod-2016.12.18 0 r INITIALIZING                192.168.21.89 node4-1
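The whole sequence (drop the replicas, force-merge, restore the replicas, then list the shards) can be scripted. The following is a minimal Python sketch using only the standard library, not the tooling used in this post. `replica_reset_plan` and `send` are hypothetical names, and the base URL given to `send` is an assumption about where the cluster is reachable.

```python
import json
import urllib.request


def replica_reset_plan(index: str, replicas: int = 1) -> list:
    """Return the requests from the steps above as (method, path, body) tuples."""
    def settings(n: int) -> str:
        return json.dumps({"index": {"number_of_replicas": n}})

    return [
        ("PUT", f"/{index}/_settings", settings(0)),                 # drop replicas
        ("POST", f"/{index}/_forcemerge?max_num_segments=1", None),  # merge segments
        ("PUT", f"/{index}/_settings", settings(replicas)),          # restore replicas
        ("GET", f"/_cat/shards/{index}*", None),                     # check the result
    ]


def send(base_url: str, step: tuple) -> str:
    """Send one step to the cluster and return the response body as text."""
    method, path, body = step
    data = body.encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    req = urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Running `send(base_url, step)` for each step in order reproduces the manual procedure. The force-merge request does not return until the merge has finished, so that step may take a while on large indices.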