# Part 5: MongoDB Indexes

# 1. How MongoDB Indexes Work

# 2. Using MongoDB Indexes

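`db.collection.createIndex(<keys>, <options>)`: creates an index

- options
  - unique: creates a unique index
  - sparse: creates a sparse index
  - expireAfterSeconds: for a date field, or an array field that contains date elements, creates an index with a time to live (TTL) that automatically deletes documents once the field value is older than the time to live
    - Compound indexes do not support the TTL property
    - For an array field, the earliest date is used to decide whether the document has expired
    - The database uses a background thread to detect and delete expired documents, so deletion can lag behind expiry

`db.collection.getIndexes()`: lists the indexes on the current collection

`db.collection.dropIndex(<index name or index definition>)`: drops an index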

First, create a collection to experiment with:

```
db.accountsWithIndex.insertMany([
	{
		name: "Alice", balance: 50, currency: ["GBP", "USD"]
	},
	{
		name: "Bob", balance: 20, currency: ["AUD", "USD"]
	},
	{
		name: "Bob", balance: 300, currency: ["CNY"]
	}
])
```

Create a single-field index on the `name` field:
```
db.accountsWithIndex.createIndex(
	{
		name: 1
	}
)
```
- 1: ascending order
- -1: descending order

Output:
```
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
```
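- numIndexesBefore: the number of indexes before this index was created. `_id` is indexed by default, so the value here is 1
- numIndexesAfter: the number of indexes after this index was created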

List the indexes on the collection:

```
db.accountsWithIndex.getIndexes()
```

Output:

```
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"name" : 1
		},
		"name" : "name_1"
	}
]
```

Create a compound index on `name` and `balance`:

```
db.accountsWithIndex.createIndex({
	name: 1,
	balance: -1
})
```

Output:

```
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 2,
	"numIndexesAfter" : 3,
	"ok" : 1
}
```

List the current indexes:

```
db.accountsWithIndex.getIndexes()
```

Output:

```
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"name" : 1
		},
		"name" : "name_1"
	},
	{
		"v" : 2,
		"key" : {
			"name" : 1,
			"balance" : -1
		},
		"name" : "name_1_balance_-1"
	}
]
```
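
The index names in this output follow a pattern: when no name is supplied, each field and its direction are joined with underscores. The sketch below reproduces that rule in plain JavaScript. `defaultIndexName` is a hypothetical helper written for illustration, not a shell or driver function, and the built-in `_id_` index is a special case it does not cover.

```javascript
// Hypothetical helper (not part of MongoDB): rebuilds the default index name
// by joining "<field>_<direction>" pairs with underscores.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ name: 1 }));              // name_1
console.log(defaultIndexName({ name: 1, balance: -1 })); // name_1_balance_-1
```

These are the names that `dropIndex()` accepts when an index is dropped by name.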

If the indexed field holds an array, MongoDB automatically creates a multikey index:
```
db.accountsWithIndex.createIndex({
	currency: 1
})
```
Each element of the array field gets its own key in the multikey index, roughly like this:
```
"AUD" --> {"Bob"}
"CNY" --> {"Bob"}
"GBP" --> {"Alice"}
"USD" --> {"Alice"}
"USD" --> {"Bob"}
```
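
The mapping above can be reproduced with a small simulation. This is only a sketch of the idea in plain JavaScript: `buildMultikeyEntries` is a hypothetical helper, and it stores the document's `name` where a real index would store a reference to the document.

```javascript
// Same documents as in the accountsWithIndex collection.
const docs = [
  { name: "Alice", balance: 50, currency: ["GBP", "USD"] },
  { name: "Bob", balance: 20, currency: ["AUD", "USD"] },
  { name: "Bob", balance: 300, currency: ["CNY"] },
];

// Hypothetical helper: emits one [key, name] entry per array element,
// then sorts the entries by key to mimic an ascending index.
function buildMultikeyEntries(documents, field) {
  const entries = [];
  for (const doc of documents) {
    const value = doc[field];
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      entries.push([element, doc.name]);
    }
  }
  // Array.prototype.sort is stable, so equal keys keep insertion order.
  return entries.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

for (const [key, name] of buildMultikeyEntries(docs, "currency")) {
  console.log(`${JSON.stringify(key)} --> ${name}`);
}
```

A document with two array elements produces two entries, which is why "USD" appears once for Alice and once for Bob.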

Drop an index by name:

```
db.accountsWithIndex.dropIndex("name_1")
```

Drop an index by its definition:

```
db.accountsWithIndex.dropIndex({"name": 1, "balance": -1})
```

Create a unique index:

```
db.accountsWithIndex.createIndex({"name": 1}, {unique: true})
```

Output:

```
{
	"ok" : 0,
	"errmsg" : "Index build failed: 4091b2ed-10d2-4ae4-a44b-3f38de41d358: Collection test.accountsWithIndex ( 7e171965-6ce1-4854-8e9e-b84bcdd67829 ) :: caused by :: E11000 duplicate key error collection: test.accountsWithIndex index: name_1 dup key: { name: \"Bob\" }",
	"code" : 11000,
	"codeName" : "DuplicateKey",
	"keyPattern" : {
		"name" : 1
	},
	"keyValue" : {
		"name" : "Bob"
	}
}
```

The build fails because two existing documents share the same `name` value ("Bob"), which violates the uniqueness constraint.
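
One way to anticipate this error is to look for repeated values before building the index. The sketch below does that over an in-memory copy of the sample documents. `findDuplicateValues` is a hypothetical helper for illustration; against a real collection the same check would have to run as a query.

```javascript
// Same documents as in the accountsWithIndex collection.
const docs = [
  { name: "Alice", balance: 50, currency: ["GBP", "USD"] },
  { name: "Bob", balance: 20, currency: ["AUD", "USD"] },
  { name: "Bob", balance: 300, currency: ["CNY"] },
];

// Hypothetical helper: returns the values of `field` that occur more than once.
function findDuplicateValues(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([value]) => value);
}

console.log(findDuplicateValues(docs, "name"));    // [ 'Bob' ]
console.log(findDuplicateValues(docs, "balance")); // []
```

A non-empty result for a field means a unique index on that field cannot be built until the duplicates are resolved.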

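Create a sparse index. Only documents that contain the indexed field are added to the index (even when the field's value is null). A document that lacks the field can still be inserted, but it is not added to the index.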
```
db.accountsWithIndex.createIndex(
	{
		balance: 1
	},
	{
		sparse: true
	}
)
```
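
The membership rule for a sparse index can be sketched as a simple filter. The documents "Carol" and "Dave" and the helper `sparseIndexMembers` are hypothetical and exist only to show the difference between a null value and a missing field.

```javascript
// Hypothetical sample documents (not from the collection above).
const docs = [
  { name: "Alice", balance: 50 },
  { name: "Carol", balance: null }, // field present, value is null
  { name: "Dave" },                 // field missing entirely
];

// Hypothetical helper: keeps only the documents that a sparse index
// on `field` would contain, i.e. those where the field exists.
function sparseIndexMembers(documents, field) {
  return documents.filter((doc) =>
    Object.prototype.hasOwnProperty.call(doc, field)
  );
}

console.log(sparseIndexMembers(docs, "balance").map((doc) => doc.name));
// [ 'Alice', 'Carol' ]
```

Dave is stored in the collection as usual but is invisible to the sparse index.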
Create an index on the `lastAccess` field with a time to live of 20 seconds:
```
db.accountsWithIndex.createIndex(
	{
		lastAccess: 1
	},
	{
		expireAfterSeconds: 20
	}
)
```
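Once this index exists, MongoDB checks the documents in the accountsWithIndex collection. Any document whose `lastAccess` time lies more than 20 seconds in the past is considered expired and is deleted automatically.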
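
The expiry rule, including the earliest-date rule for array fields, can be sketched as follows. `isExpired` is a hypothetical helper and the timestamps are made up for the example. The real deletion is done by the server's background thread, so an expired document may remain visible for a while.

```javascript
// Hypothetical helper: returns true when `doc` counts as expired at `now`.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (v) => v instanceof Date
  );
  if (dates.length === 0) return false; // no date value: never expires
  // For arrays, the earliest date decides.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return now.getTime() - earliest > expireAfterSeconds * 1000;
}

const now = new Date("2024-01-01T00:01:00Z");
const fresh = { lastAccess: new Date("2024-01-01T00:00:50Z") }; // 10 s old
const stale = { lastAccess: new Date("2024-01-01T00:00:30Z") }; // 30 s old
const mixed = {
  lastAccess: [
    new Date("2024-01-01T00:00:55Z"), // 5 s old
    new Date("2024-01-01T00:00:10Z"), // 50 s old, this one decides
  ],
};

console.log(isExpired(fresh, "lastAccess", 20, now)); // false
console.log(isExpired(stale, "lastAccess", 20, now)); // true
console.log(isExpired(mixed, "lastAccess", 20, now)); // true
```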

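# 3. MongoDB Execution Plans

`db.collection.explain(<verbose>).<method(...)>`

Here `verbose` sets the output mode of the execution plan. There are three modes:

| Mode | Description |
| --- | --- |
| queryPlanner | Details of the execution plan, including the query plan, collection information, query conditions, the winning plan, the query method, and MongoDB server information |
| executionStats | Execution statistics for the winning plan, plus information such as the rejected plans |
| allPlansExecution | Selects and runs the winning plan, then returns execution statistics for the winning plan and for the other candidate plans |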

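The commands that can be analyzed with explain() include:

- aggregate()
- count()
- distinct()
- find()
- group() (removed in MongoDB 4.2, so it only applies to older versions)
- remove()
- update()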

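# 3.1 queryPlanner Fields

| Field | Description |
| --- | --- |
| plannerVersion | Version of the query planner |
| namespace | The collection being queried |
| indexFilterSet | Whether an index filter is applied to this query shape |
| parsedQuery | Query conditions |
| winningPlan | The plan that was finally selected |
| winningPlan.stage | Query method (stage type) |
| winningPlan.inputStage | Describes the child stage, which supplies documents or index keys to its parent stage |
| winningPlan.inputStage.stage | Query method of the child stage |
| winningPlan.inputStage.keyPattern | Key pattern of the index being scanned |
| winningPlan.inputStage.indexName | Index name |
| winningPlan.inputStage.isMultiKey | Whether the index is multikey; `true` if the index is built on an array field |
| winningPlan.inputStage.direction | Scan direction |
| filter | Filter condition |
| rejectedPlans | Plans that were considered and rejected |
| serverInfo | MongoDB server information |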

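# 3.2 executionStats Fields

executionStats mode includes everything from queryPlanner mode, plus the following:

| Field | Description |
| --- | --- |
| executionStats.executionSuccess | Whether execution succeeded |
| executionStats.nReturned | Number of documents returned |
| executionStats.executionTimeMillis | Execution time of the statement, in milliseconds |
| executionStats.executionStages.executionTimeMillisEstimate | Estimated time spent retrieving documents |
| executionStats.executionStages.inputStage.executionTimeMillisEstimate | Estimated time spent scanning in the child stage |
| executionStats.totalKeysExamined | Number of index entries scanned |
| executionStats.totalDocsExamined | Number of documents scanned |
| executionStats.executionStages.isEOF | Whether the end of the stream was reached; 1 or true means it was |
| executionStats.executionStages.works | Number of work units; a query is broken down into small work units |
| executionStats.executionStages.advanced | Number of results returned (advanced) to the parent stage |
| executionStats.executionStages.docsExamined | Number of documents examined |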

# 3.3 allPlansExecution Fields

allPlansExecution mode returns everything from executionStats mode and adds an `"allPlansExecution" : [ ]` block:

```
"allPlansExecution" : [
      {
         "nReturned" : <int>,
         "executionTimeMillisEstimate" : <int>,
         "totalKeysExamined" : <int>,
         "totalDocsExamined" : <int>,
         "executionStages" : {
            "stage" : <STAGEA>,
            "nReturned" : <int>,
            "executionTimeMillisEstimate" : <int>,
            ...
         }
      },
      ...
]
```

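# 3.4 Stage Types

| Stage | Description |
| --- | --- |
| COLLSCAN | Full collection scan |
| IXSCAN | Index scan |
| FETCH | Retrieves the documents that index entries point to |
| SHARD_MERGE | Merges the data returned by each shard |
| SORT | Sort performed in memory |
| LIMIT | Limits the number of results with limit |
| SKIP | Skips results with skip |
| IDHACK | Looks up a document directly by `_id` |
| SHARDING_FILTER | Filters out documents that do not belong to the shard (orphaned documents) |
| COUNTSCAN | Count that does not use an index |
| COUNT_SCAN | Count that uses an index |
| SUBPLAN | Appears for `$or` queries that do not use an index |
| TEXT | Query that uses a text index |
| PROJECTION | Restricts the fields returned |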

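Try to keep the following stages out of an execution plan:

- COLLSCAN: full collection scan
- SORT: sort that is not backed by an index
- Unreasonable SKIP
- SUBPLAN: `$or` that does not use an index
- COUNTSCAN: count that does not use an index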
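
This check can be automated by walking a plan object and collecting any stage from the list. The sketch below is one possible approach, not an official tool. `findStagesToReview` is a hypothetical helper, the stage names are taken from the list above, and the sample plan is invented for the example. SKIP is reported every time, so judging whether it is unreasonable is left to the reader.

```javascript
// Stage names taken from the list above.
const STAGES_TO_REVIEW = new Set([
  "COLLSCAN", "SORT", "SKIP", "SUBPLAN", "COUNTSCAN",
]);

// Hypothetical helper: walks a plan through inputStage / inputStages
// and returns the listed stages it meets, parent first.
function findStagesToReview(plan) {
  const found = [];
  const visit = (node) => {
    if (!node) return;
    if (STAGES_TO_REVIEW.has(node.stage)) found.push(node.stage);
    visit(node.inputStage);
    (node.inputStages || []).forEach(visit);
  };
  visit(plan);
  return found;
}

// Invented plan: an in-memory sort on top of a full collection scan.
const samplePlan = {
  stage: "SORT",
  inputStage: { stage: "COLLSCAN", direction: "forward" },
};

console.log(findStagesToReview(samplePlan)); // [ 'SORT', 'COLLSCAN' ]
```

An empty array means none of the listed stages appear in the plan.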

# 3.5 Hands-On

Query by `balance`. Note that we have not created a dedicated index on the `balance` field:

```
db.accountsWithIndex.explain().find({"balance": 50})
```

Output:

```
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.accountsWithIndex",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"balance" : {
				"$eq" : 50
			}
		},
		"queryHash" : "88DDD986",
		"planCacheKey" : "9238DC63",
		"winningPlan" : {
			"stage" : "COLLSCAN",
			"filter" : {
				"balance" : {
					"$eq" : 50
				}
			},
			"direction" : "forward"
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "66fa73847542",
		"port" : 27017,
		"version" : "4.4.10",
		"gitVersion" : "58971da1ef93435a9f62bf4708a81713def6e88c"
	},
	"ok" : 1
}
```

The stage here is COLLSCAN, a full collection scan.

Now query by `name`, which does have an index:

```
db.accountsWithIndex.explain().find( {"name": "Bob"} )
```

Output:

```
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.accountsWithIndex",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"name" : {
				"$eq" : "Bob"
			}
		},
		"queryHash" : "01AEE5EC",
		"planCacheKey" : "0BE5F32C",
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"name" : 1
				},
				"indexName" : "name_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"name" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"name" : [
						"[\"Bob\", \"Bob\"]"
					]
				}
			}
		},
		"rejectedPlans" : [
			{
				"stage" : "FETCH",
				"inputStage" : {
					"stage" : "IXSCAN",
					"keyPattern" : {
						"name" : 1,
						"balance" : -1
					},
					"indexName" : "name_1_balance_-1",
					"isMultiKey" : false,
					"multiKeyPaths" : {
						"name" : [ ],
						"balance" : [ ]
					},
					"isUnique" : false,
					"isSparse" : false,
					"isPartial" : false,
					"indexVersion" : 2,
					"direction" : "forward",
					"indexBounds" : {
						"name" : [
							"[\"Bob\", \"Bob\"]"
						],
						"balance" : [
							"[MaxKey, MinKey]"
						]
					}
				}
			}
		]
	},
	"serverInfo" : {
		"host" : "66fa73847542",
		"port" : 27017,
		"version" : "4.4.10",
		"gitVersion" : "58971da1ef93435a9f62bf4708a81713def6e88c"
	},
	"ok" : 1
}
```

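The top-level stage is FETCH, which retrieves the documents that index entries point to. Its inputStage.stage is IXSCAN: the child stage first scans the index (IXSCAN) to find the matching keys and the locations of their documents, then hands them to the parent stage, which fetches the documents themselves (FETCH).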
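
To read such a plan in the order the data flows, the `inputStage` chain can be followed from the innermost stage outward. The sketch below does this for a trimmed copy of the winning plan shown above. `describePlan` is a hypothetical helper that only handles plans with a single `inputStage` per level.

```javascript
// Trimmed copy of the winning plan from the explain() output above.
const winningPlan = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    keyPattern: { name: 1 },
    indexName: "name_1",
    direction: "forward",
  },
};

// Hypothetical helper: lists the stages from the innermost child to the
// root, adding the index name where a stage reports one.
function describePlan(plan) {
  const steps = [];
  for (let node = plan; node; node = node.inputStage) {
    steps.unshift(
      node.indexName ? `${node.stage}(${node.indexName})` : node.stage
    );
  }
  return steps.join(" -> ");
}

console.log(describePlan(winningPlan)); // IXSCAN(name_1) -> FETCH
```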
